Main Memory Databases
A main memory database system (MMDB) is a DBMS that relies primarily on main memory for data
storage. In contrast, conventional database management systems typically use hard-disk-based
persistent storage.
Advantages
The main advantage of a main memory DBMS (MMDBMS) over conventional DBMS technology is superior
performance, because disk I/O is no longer a dominant cost factor. With I/O eliminated as the main
optimization focus, the architecture of main memory database systems typically aims at optimizing
CPU cost and CPU cache usage. This leads to different data layout strategies (avoiding complex
tuple representations) and different indexing structures (e.g., B-trees with lower fan-out whose
nodes span one or a few CPU cache lines).
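The effect of node size on fan-out can be estimated with a little arithmetic. The sketch below is a rough illustration, not a description of any particular product: the 64-byte cache line, the 4 KiB disk page, and the 8-byte keys and pointers are all assumed values.

```python
def max_fanout(node_bytes, key_bytes=8, ptr_bytes=8):
    """Largest number of child pointers that fit in one node.

    A node with n keys stores n + 1 child pointers, so we need
    n * key_bytes + (n + 1) * ptr_bytes <= node_bytes.
    """
    n_keys = (node_bytes - ptr_bytes) // (key_bytes + ptr_bytes)
    return n_keys + 1


def tree_height(num_keys, fanout):
    """Smallest number of levels h such that fanout ** h >= num_keys."""
    height, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        height += 1
    return height


cache_line_fanout = max_fanout(64)    # node = one 64-byte cache line (assumed)
disk_page_fanout = max_fanout(4096)   # node = one 4 KiB disk page (assumed)

print(cache_line_fanout, tree_height(1_000_000, cache_line_fanout))
print(disk_page_fanout, tree_height(1_000_000, disk_page_fanout))
```

Under these assumptions the cache-line-sized node has a much lower fan-out and the tree is taller, but each node visited costs roughly one cache miss rather than one disk read.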
While built on top of volatile storage, most MMDB products offer ACID properties, via the following
mechanisms: (i) Transaction Logging, which records changes to the database in a journal file and
facilitates automatic recovery of an in-memory...
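Transaction logging can be sketched as an append-only journal that is replayed on startup. The example below is a toy illustration only: the `LoggedStore` class, the JSON-lines journal format, and the file name are assumptions made for this sketch, not the mechanism of any specific MMDB product.

```python
import json
import os
import tempfile


class LoggedStore:
    """Toy in-memory key-value store backed by an append-only journal file."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._recover()

    def _recover(self):
        # Rebuild the in-memory state by replaying every logged change in order.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Write the change to the journal first, then apply it in memory.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data[key] = value


log_file = os.path.join(tempfile.mkdtemp(), "journal.log")
store = LoggedStore(log_file)
store.put("balance", 100)
store.put("balance", 80)

# Simulate a crash: build a fresh store from the journal alone.
recovered = LoggedStore(log_file)
print(recovered.data)
```

Writing to the journal before updating memory is the design choice that matters here: a change that was acknowledged is always recoverable from the file.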
ACID properties of Database
•Atomicity − A transaction must be treated as an atomic unit: either all of its operations are executed or none
are. The database must never be left in a state where a transaction is partially completed. States are defined
only before the transaction executes or after it commits, aborts, or fails.
•Consistency − The database must remain in a consistent state after any transaction. No transaction should have an
adverse effect on the data residing in the database. If the database was in a consistent state before a
transaction executed, it must remain consistent afterwards as well.
•Durability − The database must retain all of its latest updates even if the system fails or restarts. If a
transaction updates a chunk of data and commits, the database holds the modified data. If a transaction commits
but the system fails before the data is written to disk, that data is restored once the system comes back up.
•Isolation − When more than one transaction executes simultaneously and in parallel, isolation requires that each
transaction be carried out as if it were the only transaction in the system. No transaction may interfere with
any other transaction.
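Atomicity and consistency can be observed with an in-memory SQLite database through Python's standard `sqlite3` module. This is a minimal sketch: the `accounts` table, the account names, and the amounts are hypothetical, and SQLite stands in for whichever DBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The debit would make alice's balance negative, so the whole transfer fails.
ok = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```

The credit to `bob` is executed first and then undone by the rollback, so no partially completed transfer is ever visible after the call returns.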
Structured Data
What is Structured Data?
Structured data is information that is formatted and stored according to a well-defined data
model. The raw data is mapped into predesigned fields that can then be extracted and read easily
through SQL.
SQL relational databases, consisting of tables with rows and columns, are a typical example of
structured data.
Compared with semi-structured data, structured data is more interdependent and less flexible.
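As a small illustration of predesigned fields read through SQL, the sketch below uses Python's standard `sqlite3` module. The `employees` table, its columns, and the ages are hypothetical values chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema is fixed up front: every row has exactly these typed fields.
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, "
    "first_name TEXT NOT NULL, "
    "last_name TEXT NOT NULL, "
    "age INTEGER)"
)
conn.executemany(
    "INSERT INTO employees (first_name, last_name, age) VALUES (?, ?, ?)",
    [("John", "Doe", 30), ("Anna", "Smith", 41), ("Peter", "Jones", 25)],
)

# Because the fields are known in advance, extraction is a plain SQL query.
over_28 = conn.execute(
    "SELECT first_name, last_name FROM employees "
    "WHERE age > 28 ORDER BY last_name"
).fetchall()
print(over_28)
```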
Semi-Structured Data
Semi-structured data is data that does not conform to a formal data model but has some structure. It lacks a fixed
or rigid schema. It does not reside in a relational database, but it has organizational properties that make it
easier to analyze. With some processing, it can be stored in a relational database.
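One possible form of that processing is to collect every attribute seen in any record and use the union as the table's columns. The sketch below assumes hypothetical contact records and a helper named `flatten`; neither comes from the original text.

```python
# Hypothetical semi-structured records: each one carries a different set of keys.
records = [
    {"name": "John", "email": "john@example.com"},
    {"name": "Anna", "phone": "555-0100", "tags": ["vip"]},
    {"name": "Peter"},
]


def flatten(records):
    """Return (columns, rows) with one column per attribute seen in any record."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    # Attributes a record does not have become None (NULL in a relational table).
    rows = [tuple(record.get(col) for col in columns) for record in records]
    return columns, rows


columns, rows = flatten(records)
print(columns)
for row in rows:
    print(row)
```

Note that the nested `tags` list survives as a single value; a real relational design would need a separate table or a serialized column for it.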
Characteristics of semi-structured Data:
•Data does not conform to a data model but has some structure.
•Data cannot be stored directly in the form of rows and columns as in relational databases
•Semi-structured data contains tags and elements (metadata) which are used to group data and describe how the data
is stored
•Similar entities are grouped together and organized in a hierarchy
Evolution of Semi-Structured Data
The increasing digitization of almost everything we interact with, along with a growing number of
transactions, has produced a massive amount of data. The rate at which digital information is
generated has led global data volumes to double in very short time intervals. According to
Gartner, around 80% of the data held by organizations is unstructured or semi-structured,
comprising data from emails, social media feeds and customer calls.
This is in addition to information logged by user devices. It has become increasingly difficult to
make proper sense of this unstructured data.
Characteristics
•Data has some structure, but that structure does not conform to a data model.
•A hierarchy is defined in which similar entities form groups, and the groups are organised into the hierarchy.
•It cannot be stored as table rows and columns like data in a relational database.
•Semi-structured data has metadata (tags and elements) that helps group it and describe how it is stored.
•The attributes within a group of items are typically different.
•The entities in a group may or may not have the same properties and attributes.
•Semi-structured data is hard to manage or automate because its metadata is insufficient, so it cannot be put
directly into a table with rows and columns.
•Programming against such data is difficult because it lacks a sufficiently defined structure.
Examples/Sources
•Sources of semi-structured Data:
• E-mails
• XML and other markup languages
• Binary executables
• TCP/IP packets
• Zipped files
• Integration of data from different sources
• Web pages
Advantages
•The data is not constrained by a fixed schema
•It is flexible, i.e., the schema can be changed easily
•Data is portable
•It is possible to view structured data as semi-structured data
•It supports users who cannot express their needs in SQL
•It can deal easily with the heterogeneity of sources
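The point about viewing structured data as semi-structured data can be made concrete: any table row can be re-expressed as a self-describing document by pairing each value with its column name. The column names and rows below are hypothetical.

```python
import json

# A structured table: fixed column order, values only.
columns = ("id", "first_name", "last_name")
rows = [(1, "John", "Doe"), (2, "Anna", "Smith")]

# The same data as semi-structured documents: every value carries its own name.
documents = [dict(zip(columns, row)) for row in rows]
as_json = json.dumps(documents)
print(as_json)
```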
Disadvantages
•The lack of a fixed, rigid schema makes the data difficult to store
•Interpreting the relationship between data is difficult as there is no separation of the schema
and the data.
•Queries are less efficient as compared to structured data.
Nested Data Types
Nested data types are structured data types for some common data patterns. Nested data types
support structs, arrays, and maps.
A struct is similar to a relational table: it groups object properties together.
As a general definition, nested data exists whenever multiple records are associated with a single
parent record. The data then consists of two levels, with header information populated at Level 1
and details or itemized information populated at Level 2.
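The two-level pattern can be sketched with Python dataclasses, where a struct holds an array of structs and a map. The `Order` and `LineItem` types and their fields are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LineItem:
    """Level 2: itemized detail."""
    sku: str
    quantity: int
    unit_price: float


@dataclass
class Order:
    """Level 1: header information."""
    order_id: int
    customer: str
    items: list = field(default_factory=list)        # array of structs
    attributes: dict = field(default_factory=dict)   # map of key -> value


order = Order(
    order_id=1001,
    customer="John Doe",
    items=[LineItem("A-1", 2, 9.5), LineItem("B-7", 1, 20.0)],
    attributes={"channel": "web"},
)

# Detail records are reached through the header record they are nested under.
total = sum(item.quantity * item.unit_price for item in order.items)
print(total)
```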
Nested Data Types – XML
•XML stands for eXtensible Markup Language
•XML is a markup language much like HTML
•XML was designed to store and transport data
•XML was designed to be self-descriptive
•XML is a W3C (World Wide Web Consortium) Recommendation
Difference between HTML and XML
XML and HTML were designed with different goals:
•XML was designed to carry data - with focus on what data is
•HTML was designed to display data - with focus on how data looks
•XML tags are not predefined like HTML tags are
Advantages of XML
•It simplifies data sharing
•It simplifies data transport
•It simplifies platform changes
•It simplifies data availability
Many computer systems contain data in incompatible formats. Exchanging data between incompatible systems (or upgraded
systems) is a time-consuming task for web developers. Large amounts of data must be converted, and incompatible data is often
lost.
XML stores data in plain text format. This provides a software- and hardware-independent way of storing, transporting, and sharing
data.
XML also makes it easier to expand or upgrade to new operating systems, new applications, or new browsers, without losing data.
With XML, data can be available to all kinds of "reading machines" like people, computers, voice machines, news feeds, etc.
Examples of XML data
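As an illustration, the sketch below parses a small XML document with Python's standard `xml.etree.ElementTree` module. The `employees` document is a hypothetical example; its tag names are chosen by the author of the data, not predefined by XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML data: nested, self-descriptive elements plus an attribute.
xml_text = """
<employees>
  <employee id="1">
    <firstName>John</firstName>
    <lastName>Doe</lastName>
  </employee>
  <employee id="2">
    <firstName>Anna</firstName>
    <lastName>Smith</lastName>
  </employee>
</employees>
""".strip()

root = ET.fromstring(xml_text)

# Child elements are looked up by tag name; attributes by key.
names = [
    (e.findtext("firstName"), e.findtext("lastName"))
    for e in root.findall("employee")
]
ids = [e.get("id") for e in root.findall("employee")]
print(names, ids)
```

Attribute values such as `id` are always read back as strings; XML itself carries no numeric types.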
Nested Data Types – JSON
▪JSON stands for JavaScript Object Notation
▪JSON is a lightweight format for storing and transporting data
▪JSON is often used when data is sent from a server to a web page
▪JSON is "self-describing" and easy to understand
JSON Syntax
•Data is in name/value pairs
•Data is separated by commas
•Curly braces hold objects
•Square brackets hold arrays
The JSON format is syntactically identical to the code for creating JavaScript objects.
Because of this similarity, a JavaScript program can easily convert JSON data into native
JavaScript objects.
JSON Data
JSON Objects
JSON objects are written inside curly braces.
Just like in JavaScript, objects can contain multiple name/value pairs:
'{"name":"John", "age":30, "car":null}'
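The same conversion is available outside JavaScript. As a sketch, Python's standard `json` module turns the object above into a native dictionary, mapping JSON `null` to `None`.

```python
import json

text = '{"name":"John", "age":30, "car":null}'

# Parse the JSON text into a native Python dict.
person = json.loads(text)
print(person["name"], person["age"], person["car"])

# Serialize it back to JSON text.
round_trip = json.dumps(person)
print(round_trip)
```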
JSON Arrays
JSON arrays are written inside square brackets.
Just like in JavaScript, an array can contain objects:
"employees":[
{"firstName":"John", "lastName":"Doe"},
{"firstName":"Anna", "lastName":"Smith"},
{"firstName":"Peter", "lastName":"Jones"}
]
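The array above is written as a name/value member, so it has to sit inside an object to form a complete JSON document. The sketch below wraps it in curly braces and parses it with Python's standard `json` module.

```python
import json

# The member from the text, wrapped in braces to make a complete JSON object.
fragment = """"employees":[
{"firstName":"John", "lastName":"Doe"},
{"firstName":"Anna", "lastName":"Smith"},
{"firstName":"Peter", "lastName":"Jones"}
]"""
document = json.loads("{" + fragment + "}")

# The JSON array becomes a Python list of dicts.
last_names = [employee["lastName"] for employee in document["employees"]]
print(last_names)
```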
JSON Data Examples
Advantages of JSON
•Less Verbose: JSON has a more compact style than XML, and it is often more readable. The lightweight approach of
JSON can make significant improvements in RESTful APIs working with complex systems.
•Faster: The XML software parsing process can take a long time. One reason for this problem is the DOM
manipulation libraries that require more memory to handle large XML files. JSON uses less data overall, so you
reduce the cost and increase the parsing speed.
•Readable: The JSON structure is straightforward and readable. You have an easier time mapping to domain objects,
no matter what programming language you're working with.
•Structure Matches the Data: JSON uses a map data structure rather than XML's tree. In some situations, key/value
pairs can limit what you can do, but you get a predictable and easy-to-understand data model.
•Objects Align in Code: JSON objects and code objects match, which is beneficial when quickly creating domain
objects in dynamic languages.
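The verbosity claim can be checked on a small case by serializing the same record both ways with Python's standard library. This is a sketch on one hypothetical record, not a benchmark, and it says nothing about parsing speed.

```python
import json
import xml.etree.ElementTree as ET

# One hypothetical record, serialized in both formats.
record = {"firstName": "John", "lastName": "Doe"}

# Compact JSON: no spaces after separators.
json_text = json.dumps(record, separators=(",", ":"))

# Equivalent XML: one child element per field under an <employee> root.
root = ET.Element("employee")
for key, value in record.items():
    ET.SubElement(root, key).text = value
xml_text = ET.tostring(root, encoding="unicode")

print(len(json_text), json_text)
print(len(xml_text), xml_text)
```

XML repeats every field name in a closing tag, which accounts for most of the size difference in this example.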
Semantic Databases
• Semantics is the study of meaning
• It focuses on the relationship between:
• Signifiers: words, phrases, signs and symbols
• Denotation: what they stand for
• A semantic database is typically used in conjunction with the Semantic Data Model
• By exposing the semantics of the data, machines can then utilize the information in more interesting
ways than just storing it or displaying it
◦ Semantics are useful for understanding the structure of a "thing"; however, we need ontologies to relate
things to other things. Therefore, one would expect a semantic database to be relational: it should
be able to relate structured data into ontologies.
Semantic Databases
In a semantic database (going back to the very early definition of semantics), the schema:
◦ describes denotations
◦ describes relationships between denotations
The job of the database then is to associate signifiers (values) to those denotations. Therefore:
◦ Structure resolves to concrete properties to which instance values can be associated
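One way to picture this split, assumed here for illustration rather than taken from the text, is a small schema of denotations plus a list of subject/property/value triples that attach instance values to them. All names below (`schema`, `triples`, `values`, `is_valid`) are hypothetical.

```python
# Schema: denotations (types and their properties) and relationships between them.
schema = {
    "Person": {"name": "string", "worksFor": "Company"},
    "Company": {"name": "string"},
}

# Data: instance values (signifiers) associated with schema properties.
triples = [
    ("person:1", "type", "Person"),
    ("person:1", "name", "John Doe"),
    ("person:1", "worksFor", "company:1"),
    ("company:1", "type", "Company"),
    ("company:1", "name", "Acme"),
]


def values(subject, prop):
    """All values stored for one property of one subject."""
    return [o for s, p, o in triples if s == subject and p == prop]


def type_of(subject):
    return values(subject, "type")[0]


def is_valid(subject, prop):
    """A property is valid only if the schema defines it for the subject's type."""
    return prop in schema[type_of(subject)]


# Follow a relationship defined in the schema from one instance to another.
employer = values("person:1", "worksFor")[0]
employer_name = values(employer, "name")[0]
print(employer_name)
```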
Semantic Data Model
Semantic data model (SDM) is a high-level semantics-based database description and structuring
formalism (database model) for databases. ... It is a conceptual data model in which semantic
information is included. This means that the model describes the meaning of its instances.
It is also a method of organizing data that reflects the basic meaning of data items and the
relationships among them. This organization makes it easier to develop application programs and to
maintain the consistency of data when it is updated.