Answer 1. a. Semi-structured data
A form of data that is not organized into tables or other conventional relational database formats. In a semi-structured data model, the data can typically be organized as some kind of tree or graph, and the contents of the nodes are not necessarily strictly typed. The nodes here could be tags. Example:
<Students>
  <Student>
    <name>Bob</name>
    <id>777777777</id>
  </Student>
</Students>
b. DTD <!ELEMENT> tag
It is a declaration used to define XML elements, which are the building blocks of any XML document. An element can contain text, other elements, or a mix of both, and it can carry attributes. These elements appear as tags in the XML document. A simple format for this declaration is:
<!ELEMENT element_name (list of child element names, each with its occurrence indicator)>
In place of a list of children, the content model can also be EMPTY, ANY, or (#PCDATA). Example: (the example for both b and c is after c)
c. DTD <!ATTLIST> tag
It is a declaration used to define the list of attributes for a particular element defined in the DTD. A simple format for this declaration is:
<!ATTLIST element_name attribute_name_1 attribute_type [default_value] ... attribute_name_n attribute_type [default_value]>
Attributes can have different types, such as CDATA, ID, IDREF, NMTOKEN, or an enumerated set of values. (#PCDATA is a content type for elements, not an attribute type.) Whether an attribute is required or optional is also declared here, using #REQUIRED, #IMPLIED, or #FIXED.
Example for b and c part:
<!DOCTYPE midterm [
  <!ELEMENT midterm (card*)>
  <!ELEMENT card (firstname)>
  <!ELEMENT firstname (#PCDATA)>
  <!ATTLIST card ID CDATA #REQUIRED>
]>
A sample XML:
<midterm>
  <card ID="013931476">
    <firstname>Sida</firstname>
  </card>
  <card ID="013931477">
    <firstname>Ishaan</firstname>
  </card>
  <card ID="013931478">
    <firstname>Parth</firstname>
  </card>
</midterm>
GROUP : Ishaan, Sida, Parth(Edited: 2020-10-05)
Predicates that are suitable for describing movies:
Director Title Actor IMDbRating Budget Duration Year
Document in N3 notation
@prefix mv: <http://xmlns.com/mymoviesont.com> .
mv:RockNRolla mv:Director "Guy Ritchie" ;
    mv:Title "RocknRolla" ;
    mv:Actor "Gerard Butler", "Tom Hardy", "Thandie Newton" ;
    mv:Duration 114 ;
    mv:Year 2008 .
Example SPARQL query: find all movies (title and year) starring "Tom Hardy"
PREFIX mv: <http://xmlns.com/mymoviesont.com>
SELECT ?title ?year
WHERE {
  ?movie mv:Title ?title .
  ?movie mv:Actor "Tom Hardy" .
  ?movie mv:Year ?year .
}
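To make the meaning of the three triple patterns concrete, here is a minimal pure-Python sketch of the same lookup. It is not a SPARQL engine: the tuple representation, the name `movies_starring`, and the assumption that each movie has exactly one title and one year are simplifications made for illustration.

```python
# Triples from the N3 document, as (subject, predicate, object) tuples.
TRIPLES = [
    ("RockNRolla", "Director", "Guy Ritchie"),
    ("RockNRolla", "Title", "RocknRolla"),
    ("RockNRolla", "Actor", "Gerard Butler"),
    ("RockNRolla", "Actor", "Tom Hardy"),
    ("RockNRolla", "Actor", "Thandie Newton"),
    ("RockNRolla", "Duration", 114),
    ("RockNRolla", "Year", 2008),
]


def movies_starring(triples, actor):
    """Return sorted (title, year) pairs for movies that list `actor`."""
    # Pattern: ?movie Actor "<actor>"
    movies = {s for s, p, o in triples if p == "Actor" and o == actor}
    # Patterns: ?movie Title ?title  and  ?movie Year ?year
    # (assumes one title and one year per movie)
    titles = {s: o for s, p, o in triples if p == "Title"}
    years = {s: o for s, p, o in triples if p == "Year"}
    # Join on the shared ?movie variable.
    return sorted(
        (titles[m], years[m]) for m in movies if m in titles and m in years
    )


result = movies_starring(TRIPLES, "Tom Hardy")
```

The shared `?movie` variable in the query plays the role of the join key here: a movie only appears in the result if all three patterns match for the same subject.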
Question 5
Example xml file
----------------------------------------
<business_card>
  <name>Bob Bobber</name>
  <company position="ceo">Bob's Burgers</company>
  <address>777 San Jose, CA</address>
  <email>bob@burger.com</email>
  <phone_number>777-7777</phone_number>
</business_card>
Schema-oblivious mapping table creation
---------------------------------------
CREATE TABLE NODE(
  ID CHAR(6) NOT NULL PRIMARY KEY,
  PARENT_ID CHAR(6),
  TYPE VARCHAR(9),
  LABEL VARCHAR(20),
  VALUE TEXT,
  FOREIGN KEY (PARENT_ID) REFERENCES NODE(ID),
  CONSTRAINT CC1 CHECK (TYPE IN ('element', 'attribute'))
);
PostgreSQL example of first insert.
------------------------------------
INSERT INTO NODE(id, parent_id, "type", "label", "value")
VALUES ('1', NULL, 'element', 'business_card', NULL);
Example Table
-----------------------------------------------
ID  Parent_ID  Type       Label          Value
1   NULL       element    business_card  NULL
2   1          element    name           Bob Bobber
3   1          element    company        Bob's Burgers
4   3          attribute  position       ceo
5   1          element    address        777 San Jose, CA
6   1          element    email          bob@burger.com
7   1          element    phone_number   777-7777
(Edited: 2020-10-05)
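As a rough illustration of how such rows could be produced, the sketch below walks the example document with Python's standard `xml.etree.ElementTree` and emits one row per element or attribute, numbered in document order. The function name `shred` and the choice to number attributes right after their owning element are assumptions, not part of the original answer.

```python
import xml.etree.ElementTree as ET

CARD_XML = """<business_card>
  <name>Bob Bobber</name>
  <company position="ceo">Bob's Burgers</company>
  <address>777 San Jose, CA</address>
  <email>bob@burger.com</email>
  <phone_number>777-7777</phone_number>
</business_card>"""


def shred(xml_text):
    """Return (id, parent_id, type, label, value) rows in document order."""
    rows = []

    def visit(elem, parent_id):
        node_id = len(rows) + 1
        # Whitespace-only text (indentation) is stored as NULL / None.
        text = (elem.text or "").strip() or None
        rows.append((node_id, parent_id, "element", elem.tag, text))
        # Attributes become child rows of the element that carries them.
        for name, value in elem.attrib.items():
            rows.append((len(rows) + 1, node_id, "attribute", name, value))
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)
    return rows


rows = shred(CARD_XML)
```

Each tuple could then be passed as the parameters of an `INSERT INTO NODE` statement. Because the mapping never looks at the element names, the same function works for any XML document, which is the sense in which the approach is schema-oblivious.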
1. Update the local node's heartbeat state (the version) and construct the node's local view of the cluster's gossip endpoint state.
2. Pick a random other node in the cluster to exchange gossip endpoint state with. (This step allows for node-failure detection.)
3. Probabilistically attempt to gossip with an unreachable node (if one exists).
4. Gossip with a seed node if that didn't happen in Step 2. This lets us discover nodes that have just entered the system and lets such new nodes find out about the other ones.
The point of gossiping is to let every node in the "ring topology" scheme know where all the other nodes are, so that key-value store queries can be distributed across the ring and no single node takes excess strain. This is what makes the consistent hashing scheme workable: any node that receives a query can tell which live node should process it.
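The four steps can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the `Node` class, the `exchange` and `gossip_round` helpers, the 0.5 probability in step 3, and the in-process "network" (a dict of nodes) are all hypothetical and not taken from any real implementation.

```python
import random


class Node:
    def __init__(self, name, seeds):
        self.name = name
        self.seeds = list(seeds)   # names of seed nodes
        self.unreachable = set()   # peers currently believed to be down
        # Local view: node name -> highest heartbeat version seen so far.
        self.view = {name: 0}

    def merge(self, other_view):
        """Keep the newest heartbeat version known for every node."""
        for node, version in other_view.items():
            if version > self.view.get(node, -1):
                self.view[node] = version


def exchange(a, b):
    """Two-way gossip: both nodes end up with the merged view."""
    a.merge(b.view)
    b.merge(a.view)


def gossip_round(node, cluster, rng=random):
    """Run one round of the four steps; `cluster` maps name -> Node."""
    # Step 1: bump our own heartbeat version.
    node.view[node.name] += 1
    # Step 2: gossip with one random peer that we believe is alive.
    live = [n for n in node.view
            if n != node.name and n not in node.unreachable]
    peer = rng.choice(live) if live else None
    if peer is not None:
        exchange(node, cluster[peer])
    # Step 3: with some probability (0.5 is an arbitrary choice),
    # also try a peer that was marked unreachable.
    if node.unreachable and rng.random() < 0.5:
        target = rng.choice(sorted(node.unreachable))
        if target in cluster:  # hypothetical stand-in for "it answered"
            exchange(node, cluster[target])
            node.unreachable.discard(target)
    # Step 4: gossip with a seed unless step 2 already picked one.
    seeds = [s for s in node.seeds if s != node.name]
    if seeds and peer not in seeds:
        exchange(node, cluster[rng.choice(seeds)])


# Three nodes that initially know only themselves; "A" is the seed.
cluster = {name: Node(name, seeds=["A"]) for name in "ABC"}
gossip_round(cluster["B"], cluster)
gossip_round(cluster["C"], cluster)
```

After these two rounds, node C has learned about node B purely through the seed, which is the discovery effect described in step 4.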
The three CAP properties are Consistency (all nodes see the same data), Availability (every request receives a response indicating success or failure), and Partition tolerance (the system keeps working even when network failures drop or delay messages between nodes).
a) Document-oriented approach: XML documents are stored whole in a single column, either a BLOB (binary large object) or a CLOB (character large object). With a CLOB, the documents can be searched if a full-text index is defined on the column. Some DBMSs, like Postgres, support an XML column type that makes it easier to use XPath expressions on the column. This approach is easy to implement but does not work well with traditional SQL processing and optimization.
b) Data-oriented approach: the XML document is shredded, that is, decomposed into its data parts and spread across connected tables. The DBMS handles the mappings between data parts and tables. This approach allows for more precise SQL manipulation and constraint of elements in the stored documents. However, for complicated document structures, manipulation and querying might involve many non-trivial joins.
c) Combined approach (aka partial shredding): Some subtrees of the XML document are shredded (data-oriented approach) while others are stored as BLOBs or CLOBs (document-oriented approach). SQL views are then used to reconstruct the original document. For example, consider a movie XML document. The movie title, production company, and actors might be stored in a traditional relation that can easily be searched using SQL. The synopsis of the movie, trailer video, photos, etc. might be stored as BLOBs or CLOBs.
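A minimal sketch of the combined approach, using Python's built-in `sqlite3` module, might look as follows. The table layout, the `store_movie` helper, and the synopsis text are hypothetical, and a SQLite `TEXT` column stands in for a CLOB.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical movie document; the synopsis has mixed content.
MOVIE_XML = """<movie>
  <title>RocknRolla</title>
  <year>2008</year>
  <synopsis>A <em>hypothetical</em> synopsis with mixed content.</synopsis>
</movie>"""


def store_movie(conn, xml_text):
    """Shred title and year into columns; keep the synopsis subtree as XML."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title")
    year = int(root.findtext("year"))
    # The subtree is serialized unchanged and stored as character data.
    synopsis_xml = ET.tostring(root.find("synopsis"), encoding="unicode")
    conn.execute(
        "INSERT INTO movie(title, year, synopsis_xml) VALUES (?, ?, ?)",
        (title, year, synopsis_xml),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie (title TEXT, year INTEGER, synopsis_xml TEXT)"
)
store_movie(conn, MOVIE_XML)

# The shredded columns can be queried with plain SQL.
row = conn.execute(
    "SELECT title, synopsis_xml FROM movie WHERE year = 2008"
).fetchone()
```

The shredded columns support ordinary predicates such as `year = 2008`, while the synopsis comes back as an opaque XML fragment that the application has to parse itself.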
Question 8. Give a REST api that might be suitable for a stock quote web site, make sure it has an operation that can return daily stock prices for the last week. Give an example JSON document that might be returned by this operation.
The following example returns the daily stock prices of stock id 7 for the time period from 09-28-2020 to 10-04-2020.
GET Request:
stocks/getQuote?id=7&freq=daily&startTime=09-28-2020&endTime=10-04-2020
Response JSON:
{
  "id": 7,
  "name": "ABC",
  "quote": [
    { "date": "09-28-2020", "price": 166.08 },
    { "date": "09-29-2020", "price": 163.60 },
    { "date": "09-30-2020", "price": 165.26 },
    { "date": "10-01-2020", "price": 167.86 },
    { "date": "10-02-2020", "price": 168.08 }
  ]
}
Explain how we might store a cache value of such a document using memcached. (Say how to run a memcache server and how such a value could be stored using the memcache protocol.)
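Start the memcached server using the following command:
memcached -l 127.0.0.1 -p 11211
where -l is the address the server listens on and -p is the port it listens on. A client can then connect to that address and port, for example with telnet 127.0.0.1 11211. To store the value of the JSON in the memcached server we use the text protocol. The set command has the form set <key> <flags> <exptime> <bytes>, followed on the next line by exactly <bytes> bytes of data. Set the key-value record as:
set 7 0 0 18
09-28-2020, 166.08
The command should return STORED if successful. The whole JSON document can be stored the same way by sending it as the data block, with <bytes> set to its length in bytes.
(Edited: 2020-10-05)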
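Since the byte count in the set line has to match the data block exactly, it is easier to compute it than to count by hand. The sketch below only builds the request bytes; the helper name `build_set_command` and the use of the stock id as the key are assumptions, and nothing is actually sent to a server.

```python
import json


def build_set_command(key, value, flags=0, exptime=0):
    """Return the bytes of a memcached text-protocol `set` request."""
    data = value.encode("utf-8")
    # The length field counts the bytes of the data block only,
    # not the trailing \r\n that terminates it.
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode("ascii")
    return header + data + b"\r\n"


# The single-value example from the answer.
short_command = build_set_command("7", "09-28-2020, 166.08")

# Caching a JSON document works the same way (document abbreviated here).
quote_doc = {
    "id": 7,
    "name": "ABC",
    "quote": [{"date": "09-28-2020", "price": 166.08}],
}
json_command = build_set_command("7", json.dumps(quote_doc))
```

These bytes could be written to a TCP socket connected to the server, which replies with `STORED` followed by `\r\n` on success.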
Question 3
SAMPLE XML DOCUMENT
<Businesscards>
  <BusinessCard>
    <Name>
      <Firstname>Bob</Firstname>
      <LastName>Kanagavel</LastName>
    </Name>
    <Designation>Product Manager</Designation>
    <Company>RP Inc</Company>
    <Phone>7799578901</Phone>
    <Email>Bob.kanagavel@rpi.com</Email>
    <Address>RP INC , 4, Washington Square, San Jose, CA</Address>
  </BusinessCard>
</Businesscards>
XPath Expression: returns all business cards whose last name contains 'Kanagavel' and whose address contains 'San Jose'
doc("businesscards.xml")//BusinessCard[contains(Name/LastName, 'Kanagavel') and contains(Address, 'San Jose')]
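Python's standard `xml.etree.ElementTree` supports only a subset of XPath and has no `contains()`, so the sketch below expresses the same two conditions as an ordinary filter. The element names `BusinessCard` and `LastName` are assumed to be spelled consistently, the second card is made up so that the filter has something to reject, and `matching_cards` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

CARDS_XML = """<Businesscards>
  <BusinessCard>
    <Name><Firstname>Bob</Firstname><LastName>Kanagavel</LastName></Name>
    <Address>RP INC , 4, Washington Square, San Jose, CA</Address>
  </BusinessCard>
  <BusinessCard>
    <Name><Firstname>Alice</Firstname><LastName>Example</LastName></Name>
    <Address>1 Hypothetical Way, San Jose, CA</Address>
  </BusinessCard>
</Businesscards>"""


def matching_cards(xml_text, last_name_part, address_part):
    """Return BusinessCard elements that satisfy BOTH substring tests."""
    root = ET.fromstring(xml_text)
    return [
        card
        for card in root.iter("BusinessCard")
        if last_name_part in card.findtext("Name/LastName", default="")
        and address_part in card.findtext("Address", default="")
    ]


cards = matching_cards(CARDS_XML, "Kanagavel", "San Jose")
```

Both cards have a San Jose address, but only one has the matching last name, so requiring both conditions selects a single card. Combining the two tests with a union instead would select every card that meets either one.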
DOM API
The Document Object Model (DOM) API for XML processing is a tree-based API that represents an XML document as a tree in memory. DOM provides various classes and methods for navigating the tree and for operations such as adding, moving, or removing elements. Because the whole document is held in memory at one time, it often uses a lot of space for larger documents.
For example, consider the following XML document:
<note>
  <to>Bob</to>
  <from>Bill</from>
</note>
DOM would represent the document in memory as the following tree structure:
Document
  note
    to
      Bob
    from
      Bill
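The same tree can be inspected with Python's standard `xml.dom.minidom`, one of several DOM implementations. In this sketch the `outline` helper is a hypothetical name, and the `<body>` element added at the end is invented to show an in-place modification.

```python
from xml.dom import minidom

doc = minidom.parseString("<note><to>Bob</to><from>Bill</from></note>")
note = doc.documentElement


def outline(node, depth=0):
    """Return one indented line per node in the tree rooted at `node`."""
    # Text nodes are shown by their content, other nodes by their name.
    label = node.data if node.nodeType == node.TEXT_NODE else node.nodeName
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(doc)

# Because the whole tree is in memory, it can also be modified in place.
body = doc.createElement("body")
body.appendChild(doc.createTextNode("Hello"))
note.appendChild(body)
updated_xml = note.toxml()
```

Note that the text "Bob" is a node of its own, a child of the `to` element, which is why it sits one level deeper in the outline.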
SAX API
The Simple API for XML (SAX) is an event-based XML processing API that represents an XML document as a stream of events. This stream is passed directly to the application, which uses an event handler to process each event. So access with SAX is more path-in-tree-like than whole-tree-like. Since no tree is built, it tends to use less memory than DOM. SAX tends to be good for simple sequential operations on a document or for picking out one part of a document in a single pass, but it performs worse if you need to do heavy data manipulation, because earlier parts of the document are not kept around.
SAX would report the XML document from the DOM example as the following events:
start document
start element: note
start element: to
text: Bob
end element: to
start element: from
text: Bill
end element: from
end element: note
end document
(Edited: 2020-10-05)
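This event stream can be reproduced with Python's standard `xml.sax` module. The handler below is a minimal sketch: the class name `EventLogger` is made up, and it assumes each text run arrives in a single `characters` call, which SAX parsers do not guarantee for long text.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record each SAX callback as a readable line."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("start document")

    def endDocument(self):
        self.events.append("end document")

    def startElement(self, name, attrs):
        self.events.append(f"start element: {name}")

    def endElement(self, name):
        self.events.append(f"end element: {name}")

    def characters(self, content):
        # Skip whitespace-only text such as indentation between tags.
        if content.strip():
            self.events.append(f"text: {content}")


handler = EventLogger()
xml.sax.parseString(b"<note><to>Bob</to><from>Bill</from></note>", handler)
```

The handler keeps only what it chooses to record. The parser itself builds no tree, which is where the memory advantage over DOM comes from.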
Question 2
XML Schema card
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="business_card">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="company-name" type="xs:string"/>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="designation" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
        <xs:element name="address" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Sample XML Document
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" ?>
<business_card xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="card.xsd">
  <company-name>dummy_company</company-name>
  <name>Bob</name>
  <designation>Developer</designation>
  <email>bob77777@gmail.com</email>
  <phone>777777777</phone>
  <address>dummy_address</address>
</business_card>
XSLT card
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="business_card">
    <html>
      <head>
        <link rel="stylesheet" type="text/css"/>
        <title>Card</title>
      </head>
      <body>
        <h1>Business Card for Company: <xsl:value-of select="company-name"/></h1>
        <p><xsl:value-of select="name"/></p>
        <p><xsl:value-of select="designation"/></p>
        <p><xsl:value-of select="email"/></p>
        <p><xsl:value-of select="phone"/></p>
        <p><xsl:value-of select="address"/></p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
GROUP : Ishaan, Sida, Parth(Edited: 2020-10-05)