What are the types of databases, please?
Databases can be classified along two dimensions: whether they are relational or non-relational, and whether they are operational or analytical. This gives four categories.
Operational-Relational Database:
Mainstream vendors: Microsoft SQL Server, IBM DB2, SAP HANA, Amazon Aurora, Azure SQL Database, EnterpriseDB (PostgreSQL), MySQL, MemSQL
Advantages: mature ecosystem, transaction guarantees and data consistency
Disadvantages: strict data model definition, limited scalability, and difficulty integrating unstructured data.
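As a rough illustration of the transaction guarantee mentioned above, the following sketch uses Python's built-in sqlite3 module. The accounts table, the names in it, and the transfer function are made up for this example; they stand in for any relational database and are not taken from a particular product.

```python
import sqlite3

# In-memory database; the table and account names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # CHECK fails, both updates are undone
balances = dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
print(ok, failed, balances)  # True False {'alice': 70, 'bob': 80}
```

The second transfer credits bob before the debit on alice violates the CHECK constraint, yet bob's balance is unchanged afterwards, because the whole transaction is rolled back.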
Analytical-Relational Database:
Typical Application Scenarios: Data Warehouse, Business Intelligence, Data Science
Data Storage Methods: Tables
Mainstream Vendors: Oracle Exadata, Oracle Hyperion, Teradata, IBM Netezza, IBM dashDB, Amazon Redshift, Microsoft SQL Data Warehouse, Google BigQuery
Advantages: consistency of information and computation
Disadvantages: requires specialized IT staff to maintain the database, and response times are typically on the order of minutes
Operational-Nonrelational Database:
Typical application scenarios: web, mobile, and IoT applications, social networking, user recommendations, shopping carts
Data storage: many storage structures are available (document, graph, column, key-value, time series)
Mainstream vendors: MongoDB, Amazon DynamoDB, Amazon DocumentDB, Azure Cosmos DB, DataStax, Neo4j, Couchbase, MarkLogic, Redis
Advantages: ease of use, flexibility (no predefined schema required), horizontal scaling (to accommodate large amounts of data), generally low cost (open source)
Disadvantages: lack of transactional guarantees
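To make the "no predefined schema" point concrete, here is a minimal sketch in which a plain Python dictionary stands in for a document collection. The put and get helpers and the sample documents are hypothetical; this is not the API of MongoDB or any other product listed above.

```python
import json

# A plain dict stands in for a document collection keyed by document ID.
# Illustration only: this is not the API of any real NoSQL product.
collection = {}

def put(doc_id, document):
    """Store a document as JSON text; no schema is declared beforehand."""
    collection[doc_id] = json.dumps(document)

def get(doc_id):
    """Return the stored document as a dict, or None if it is missing."""
    raw = collection.get(doc_id)
    return json.loads(raw) if raw is not None else None

# Two documents in the same collection with different sets of fields.
put("user:1", {"name": "Ann", "cart": ["book", "pen"]})
put("user:2", {"name": "Raj", "age": 31, "recommendations": {"music": ["jazz"]}})

print(get("user:1")["cart"])
print(sorted(get("user:2")))
```

Both documents live side by side even though their fields differ, which is the flexibility referred to above. The price is that nothing here checks or enforces consistency between documents.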
Analytical-Nonrelational Database:
Typical Application Scenarios: indexing millions of data points, predictive analytics, fraud detection
Data Storage Methods: Hadoop does not require an intrinsic data structure; data can be stored across multiple servers
Mainstream Vendors: Cloudera, Hortonworks, MapR, MarkLogic, Snowflake, Databricks, Elasticsearch
Advantages: suitable for batch processing and parallel processing of files; mainly open source, so the investment is low
Disadvantages: slow response time; not suitable for fast lookups or fast updates
Do SQL databases store data in the form of tables?
Data storage in a SQL Server database:
First: storage file types
SQL Server has two types of storage files: data files and log files.
Data files use 8 KB (8,192-byte) pages as the storage unit.
Log files use log records as the storage unit. This article discusses only how data files are stored, not how log files are stored.
To understand how data files are stored, you must understand the page types defined in SQL Server.
Second: page types
SQL Server has eight page types; for the details of each type, see the following chart:
Third: data page structure
In a data page, the data rows are placed in order, starting immediately after the page header. A row offset table starts at the end of the page. It contains one entry for each row on the page, and each entry records how far the first byte of that row is from the start of the page. The entries in the row offset table are in the reverse order of the rows on the page. The structure of a data page is shown in the following figure and explained in more detail below.
Data page header: the first 96 bytes, which hold system information about the page, such as the page type, the amount of free space on the page, the ID of the object that owns the page, and the physical file to which the page belongs.
Data area: the total area occupied by all the data rows in the figure above. It holds the actual data and is organized in slots. Each slot corresponds to one data row; slots are numbered from 0 (Slot 0, Slot 1, ...) and are saved in reverse hexadecimal order.
Row offset array: records the relative position of each slot within the data page, so that each slot can be located and retrieved easily. Each entry in the array occupies two bytes.
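The layout described above can be sketched in Python as follows. This is a simplified model, not SQL Server's actual on-disk format: the header is left zero-filled, rows are treated as opaque byte strings, and the little-endian encoding of the two-byte offsets is an assumption made for the example. The function names are hypothetical.

```python
import struct

PAGE_SIZE = 8192      # one 8 KB page
HEADER_SIZE = 96      # page header
SLOT_ENTRY_SIZE = 2   # each row offset entry occupies two bytes

def slot_position(slot):
    """Byte position of a slot's entry; slot 0 sits at the very end of the page."""
    return PAGE_SIZE - SLOT_ENTRY_SIZE * (slot + 1)

def build_page(rows):
    """Place rows after the header and write their offsets from the page end backwards."""
    page = bytearray(PAGE_SIZE)   # header bytes are left as zeros in this sketch
    offset = HEADER_SIZE
    for slot, row in enumerate(rows):
        if offset + len(row) > slot_position(slot):
            raise ValueError("page is full")
        page[offset:offset + len(row)] = row
        # Assumption: offsets are stored as little-endian unsigned 16-bit values.
        struct.pack_into("<H", page, slot_position(slot), offset)
        offset += len(row)
    return page

def read_row(page, slot, length):
    """Locate a row through its slot entry and return `length` bytes of it."""
    (offset,) = struct.unpack_from("<H", page, slot_position(slot))
    return bytes(page[offset:offset + length])

page = build_page([b"alpha", b"bravo!", b"charlie"])
print(read_row(page, 1, 6))  # b'bravo!'
```

Rows grow forward from byte 96 while the offset entries grow backward from byte 8191, which is why the entries appear in the opposite order to the rows.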
Fourth: storage allocation unit: the extent (Extent)
Although the storage unit of a SQL Server data file is the page, SQL Server does not allocate space one page at a time. Its default unit of storage allocation is the extent. The main reason is performance: to avoid frequent read and write I/O, space for a table or other object is allocated not as a single 8 KB page but as an extent, which is 8 pages (8 * 8 KB = 64 KB).
This reduces I/O and improves database performance, but it introduces a new problem: if an object that holds only a small amount of data (less than 8 KB, for example) is also given a whole extent, storage space is wasted and space allocation becomes less efficient.
To solve this problem, SQL Server defines two types of extents: uniform extents and mixed extents.
Uniform extent: stores only a single object, which owns all the pages of the extent.
Mixed extent: shared by multiple objects.
When extents are actually allocated to an object, to improve space utilization, an object that starts out smaller than 8 pages is by default placed in mixed extents as far as possible; once the object grows to 8 pages, SQL Server allocates uniform extents to it. (Starting with SQL Server 2016, most allocations in user databases use uniform extents by default.)
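The arithmetic and the allocation rule described above can be sketched as follows. The function names are hypothetical, and the rule is a simplification of the strategy in the text (shared/mixed allocation for the first 8 pages, whole dedicated extents afterwards), not SQL Server's actual allocator.

```python
PAGE_SIZE_KB = 8
PAGES_PER_EXTENT = 8
EXTENT_SIZE_KB = PAGE_SIZE_KB * PAGES_PER_EXTENT  # 8 * 8 KB = 64 KB

def extent_kind(pages_used):
    """Kind of extent the object's next page comes from under the rule in the text."""
    return "mixed" if pages_used < PAGES_PER_EXTENT else "uniform"

def wasted_kb_if_uniform(pages_needed):
    """Space left unused if the object were given whole extents of its own."""
    extents = -(-pages_needed // PAGES_PER_EXTENT)  # ceiling division
    return (extents * PAGES_PER_EXTENT - pages_needed) * PAGE_SIZE_KB

print(EXTENT_SIZE_KB)           # 64
print(wasted_kb_if_uniform(1))  # 56
print(extent_kind(3), extent_kind(8))  # mixed uniform
```

A one-page object given an extent of its own would leave 56 KB of the 64 KB unused, which is the waste that sharing an extent between small objects avoids.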
To allocate extents to objects according to this strategy, SQL Server uses the GAM (Global Allocation Map) and SGAM (Shared Global Allocation Map) mechanism to manage and maintain extent information for data files.
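The text does not describe how GAM and SGAM encode extent state. The sketch below assumes the bit meanings given in Microsoft's documentation: a GAM bit of 1 marks a free extent, and an SGAM bit of 1 marks a mixed extent that still has free pages. The Python lists and function names are hypothetical stand-ins for the real bitmap pages.

```python
def extent_state(gam_bit, sgam_bit):
    """Interpret one extent's pair of bits (assumed meanings, see lead-in)."""
    if gam_bit == 1 and sgam_bit == 0:
        return "free"
    if gam_bit == 0 and sgam_bit == 1:
        return "mixed extent with free pages"
    if gam_bit == 0 and sgam_bit == 0:
        return "uniform extent or full mixed extent"
    raise ValueError("unexpected bit combination")

def pick_extent(gam, sgam, want_mixed):
    """Return the index of an extent to allocate from, or None if there is none."""
    if want_mixed:
        for i, bit in enumerate(sgam):
            if bit == 1:
                return i  # reuse a mixed extent that still has room
    for i, bit in enumerate(gam):
        if bit == 1:
            return i      # otherwise take a completely free extent
    return None

gam = [0, 0, 1, 0]
sgam = [0, 1, 0, 0]
print([extent_state(g, s) for g, s in zip(gam, sgam)])
print(pick_extent(gam, sgam, want_mixed=True))   # 1
print(pick_extent(gam, sgam, want_mixed=False))  # 2
```

With two bitmaps, finding space for a small object (scan SGAM) or a large one (scan GAM) becomes a simple bit search rather than an inspection of every extent.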
What are the two main types of databases?
Databases are mainly categorized into relational databases and non-relational databases (NoSQL).
1. Relational databases
Relational databases store data in a format that directly reflects the relationships between entities. A relational database resembles a set of ordinary tables, and its tables are linked to one another by many, often complex, relationships.
Common relational databases include MySQL, SQL Server, and so on. In lightweight or small applications, the choice of relational database has little impact on system performance, but when building large applications you need to choose an appropriate relational database based on the application's business needs and performance requirements.
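As a small illustration of tables linked by relationships, the following sketch uses Python's built-in sqlite3 module. The customers and orders tables and their contents are invented for the example; the same SQL would look much the same in MySQL or SQL Server.

```python
import sqlite3

# In-memory database with two hypothetical tables linked by a foreign key.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    item        TEXT NOT NULL
);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Raj');
INSERT INTO orders VALUES (10, 1, 'book'), (11, 1, 'pen'), (12, 2, 'lamp');
""")

# The join follows the relationship from each order to its customer.
rows = db.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()
print(rows)  # [('Ann', 'book'), ('Ann', 'pen'), ('Raj', 'lamp')]
```

Each order row stores only the customer's ID; the customer's name is looked up through the relationship when the query runs.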
2. Non-relational databases (NoSQL)
This refers to data storage systems that are distributed, non-relational, and do not guarantee the ACID properties. NoSQL database technology is closely related to the CAP theorem and to consistent hashing algorithms. NoSQL databases suit applications that prioritize speed and scalability and whose business requirements change frequently.
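Consistent hashing, mentioned above, is one common way for distributed stores to spread keys across servers. The following is a minimal sketch, assuming SHA-256 as the hash function and 100 virtual points per node; the HashRing class and the node names are hypothetical and do not reflect how any specific NoSQL product implements it.

```python
import bisect
import hashlib

def _hash(key):
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")

class HashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node (assumed value)
        self._ring = []           # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def lookup(self, key):
        """Return the node owning the first ring position at or after the key's hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around at the end

ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key{i}" for i in range(200)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("node-c")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(all(before[k] == "node-c" for k in moved))  # True
```

Only the keys that belonged to the removed node are reassigned; every other key stays where it was, which is what makes the technique useful when servers join or leave a cluster.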