Database Principles and Applications Key Knowledge Summary

What are the basic principles of databases?

Basic principles of databases:

Glossary of terms:

Database: a “warehouse” that stores data and makes it available.

Data: the basic objects stored in a database.

Database Management System (DBMS): a layer of data management software located between the user and the operating system.

Database System: includes the database, the DBMS, the application systems, and the database administrator (DBA).

Primary Key: an attribute or set of attributes used to uniquely identify a record in a table.

Foreign Key: an attribute or set of attributes used to link to another table; it refers to the primary key of that other table.

Super Key: an attribute or set of attributes that uniquely distinguishes a tuple.

Candidate Key: a super key with its redundant attributes removed, so that it is minimal but can still distinguish between different tuples.

Schema: a description of a database, including its structure, data types, and constraints.

Instance/State: the actual data stored in the database at a given time. (Instance is the materialization, instantiation of Schema at a certain moment)

Data Manipulation Language (DML): inserts, deletes, updates, and queries data.

Data Definition Language (DDL): defines, drops, and alters objects in the database.

Data Control Language (DCL): controls users’ permissions to operate on the database.

Data Model: an abstraction of the characteristics of real-world data, used to define how data is organized and how data items relate to each other.

Union Compatibility: two relations are union-compatible if they have the same number of attributes and corresponding attributes have the same domain.

VIEW: A view is a virtual table that does not physically store data. Rather, it is data derived from a base table or other view. An update to a view effectively translates into an update to the actual base table.
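The idea that a view stores no data of its own can be sketched with Python's built-in sqlite3 module. The table employee, the view it_staff, and the sample rows are hypothetical names chosen only for illustration. SQLite views are read-only, so the sketch shows only the derivation from the base table, not updates made through the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES (1, 'Ann', 'Sales', 5000), (2, 'Bob', 'IT', 6000);
    -- The view stores only its defining query, not a copy of the rows.
    CREATE VIEW it_staff AS SELECT id, name FROM employee WHERE dept = 'IT';
""")

before = conn.execute("SELECT name FROM it_staff ORDER BY id").fetchall()

# A change to the base table is immediately visible through the view.
conn.execute("INSERT INTO employee VALUES (3, 'Cid', 'IT', 5500)")
after = conn.execute("SELECT name FROM it_staff ORDER BY id").fetchall()

print(before)
print(after)
```

Because the view is evaluated against the base table each time it is queried, the second query sees the newly inserted row.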

Data models:

Basic concept: an abstraction of the characteristics of real-world data, used to define how data is organized and how data items relate to each other.

Hierarchy:

1. Conceptual model: models data and information from the user’s point of view

2. Logical (implementation) model: the hierarchical, network, and relational models

3. Physical model: the way data is physically stored in a specific DBMS product

Three-level schema structure of a database system:

1. Internal schema (also known as the storage schema): a description of the physical structure and storage of the data, i.e., how the data is represented inside the database

2. Conceptual schema (also known as the global schema, and sometimes simply called the “schema”): a description of the logical structure and characteristics of all the data in the database

3. External schemas (also known as sub-schemas or user schemas): descriptions of the logical structure and characteristics of the local data that a database user can see and use

Two levels of mapping, and the physical and logical independence of a database system:

Two levels of mapping:

1. Conceptual schema/internal schema mapping

2. External schema/conceptual schema mapping

Physical independence of data:

The mapping between the internal schema and the conceptual schema provides the physical independence of data. When the physical structure of the data changes, only this mapping needs to be modified; the conceptual schema, and therefore the applications, stay unchanged.

Logical independence of data:

The mapping between the conceptual schema and the external schemas provides the logical independence of data. When the overall logical structure of the data changes, only the mapping between each external schema and the conceptual schema needs to be modified, so that applications are not affected.

Data constraints (integrity constraints):

1. Domain constraints: constraints on the range of values an attribute may take

2. Key constraints: each relation must have a primary key, and no two tuples may have the same primary key value

3. Non-null constraints: the attribute value cannot be NULL

4. Entity integrity constraints: the value of the primary key cannot be NULL

5. Referential integrity constraints: a foreign key value must either be NULL or match a primary key value in the referenced relation; it cannot be NULL if the foreign key is also part of the primary key of its own relation

6. User-defined integrity

Integrity constraints that may be violated by various data operations

Insertion operations: domain constraints, key constraints, non-null constraints, entity integrity constraints, referential integrity constraints

Deletion operations: Referential Integrity Constraints

Update operations: domain constraints, key constraints, non-null constraints, entity integrity constraints, referential integrity constraints
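The lists above can be tried out with a small sketch using Python's built-in sqlite3 module. The tables department and employee, their columns, and the sample values are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, and the key column is declared NOT NULL explicitly because SQLite does not always reject NULL in a primary key otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        dcode TEXT NOT NULL PRIMARY KEY,
        dname TEXT NOT NULL
    );
    CREATE TABLE employee (
        eno   INTEGER PRIMARY KEY,
        age   INTEGER CHECK (age BETWEEN 18 AND 65),
        dcode TEXT REFERENCES department(dcode)
    );
    INSERT INTO department VALUES ('D10', 'Research');
    INSERT INTO employee VALUES (1, 30, 'D10');
""")

# Each statement is expected to violate the constraint named in its label.
attempts = {
    "domain (insert)":      "INSERT INTO employee VALUES (2, 12, 'D10')",
    "key (insert)":         "INSERT INTO employee VALUES (1, 40, 'D10')",
    "non-null (insert)":    "INSERT INTO department VALUES ('D20', NULL)",
    "entity (insert)":      "INSERT INTO department VALUES (NULL, 'Sales')",
    "referential (insert)": "INSERT INTO employee VALUES (3, 40, 'D99')",
    "referential (delete)": "DELETE FROM department WHERE dcode = 'D10'",
    "referential (update)": "UPDATE employee SET dcode = 'D99' WHERE eno = 1",
}

violations = {}
for label, sql in attempts.items():
    try:
        conn.execute(sql)
        violations[label] = None  # the statement was accepted
    except sqlite3.IntegrityError as exc:
        violations[label] = str(exc)  # the DBMS rejected the statement

for label, message in violations.items():
    print(f"{label:22} -> {message}")
```

The deletion fails only because an employee row still references department D10, which matches the note that deletions can violate referential integrity but none of the other constraints.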

The logical order in which the clauses of a SQL query are executed:

1. The FROM clause assembles data from the different data sources

2. The WHERE clause filters the records based on the specified conditions

3. The GROUP BY clause divides the data into groups

4. Aggregate functions are calculated for each group

5. The HAVING clause filters the groups

6. All remaining expressions are calculated

7. ORDER BY sorts the result set
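The seven steps above can be traced on one query. This is a minimal sketch using Python's built-in sqlite3 module; the table grade and its rows are made up for illustration, and the step numbers in the SQL comments refer to the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grade (student TEXT, course TEXT, score INTEGER);
    INSERT INTO grade VALUES
        ('Ann', 'DB', 90), ('Ann', 'OS', 70),
        ('Bob', 'DB', 45), ('Bob', 'OS', 95),
        ('Cid', 'DB', 60), ('Cid', 'OS', 50);
""")

rows = conn.execute("""
    SELECT student, AVG(score) AS avg_score   -- step 6: compute output expressions
    FROM grade                                -- step 1: assemble the source rows
    WHERE score >= 50                         -- step 2: filter individual rows
    GROUP BY student                          -- steps 3 and 4: group, then aggregate
    HAVING COUNT(*) >= 2                      -- step 5: filter whole groups
    ORDER BY avg_score DESC                   -- step 7: sort the result
""").fetchall()

print(rows)
```

Bob is missing from the result because WHERE removes his score of 45 before grouping, which leaves his group with a single row that HAVING then discards.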

Controlled redundancy versus uncontrolled redundancy: uncontrolled redundancy in stored data can lead to the following problems:

1. Duplication of work when updating data

2. Waste of space

3. Data may become inconsistent

Ideally, then, we should design a database free of redundancy. Sometimes, however, we need to improve query efficiency, so we deliberately introduce controlled redundancy.

For example:

We store student name and course number redundantly in the GRADE_REPORT table, because when querying for grades we need to query for the student’s name as well as the course number at the same time.

The difference between a relation and a file or table: a relation looks like a two-dimensional table, but in addition:

The domain of each attribute of a relation (the range of values the attribute may take) is a set of atomic (indivisible) values

The tuples of a relation must be distinct

Relational algebra: the five basic operations are union, difference, Cartesian product, selection, and projection.

Relational algebra interpreter: a program that simulates the operations of relational algebra
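As a rough illustration of what such an interpreter simulates, the five basic operations can be sketched in a few lines of Python. The representation is an assumption made for this sketch: a relation is a set of tuples, and each tuple is a frozenset of (attribute, value) pairs, so duplicate tuples disappear automatically. All function and relation names are hypothetical.

```python
def rel(*rows):
    """Build a relation from dicts: a set of frozensets of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}

def union(r, s):
    return r | s

def difference(r, s):
    return r - s

def product(r, s):
    # Assumes the two relations have no attribute names in common.
    return {t | u for t in r for u in s}

def select(r, predicate):
    return {t for t in r if predicate(dict(t))}

def project(r, attrs):
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

R = rel({"sno": 1, "name": "Ann"}, {"sno": 2, "name": "Bob"})
S = rel({"sno": 2, "name": "Bob"}, {"sno": 3, "name": "Cid"})
C = rel({"cno": "DB"}, {"cno": "OS"})

print(len(union(R, S)))                          # tuples in R or S, duplicates merged
print(difference(R, S))                          # tuples in R but not in S
print(len(product(R, C)))                        # every pairing of an R tuple with a C tuple
print(select(R, lambda t: t["sno"] > 1))         # horizontal subset of R
print(project(union(R, S), {"name"}))            # vertical subset of R union S
```

Using sets for relations mirrors the rule stated above that the tuples of a relation must be distinct.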

Types of inner joins:

1. Equijoins

2. Non-equijoins

3. Natural joins

SQL statement to copy a table’s structure (relationships between tables are not included):

SELECT * INTO COPY_DEPARTMENT FROM DEPARTMENT WHERE 1 = 0;
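The statement above uses SELECT ... INTO, which SQL Server accepts but SQLite does not. To try the same idea with Python's built-in sqlite3 module, the sketch below substitutes SQLite's CREATE TABLE ... AS SELECT, which is an analogue and not the same statement. The columns of DEPARTMENT are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (dno INTEGER PRIMARY KEY, dname TEXT, budget REAL);
    INSERT INTO DEPARTMENT VALUES (1, 'Research', 100.0), (2, 'Sales', 50.0);
    -- WHERE 1 = 0 is never true, so only the column layout is copied, no rows.
    CREATE TABLE COPY_DEPARTMENT AS SELECT * FROM DEPARTMENT WHERE 1 = 0;
""")

def table_info(table):
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    return conn.execute(f"PRAGMA table_info({table})").fetchall()

copy_columns = [col[1] for col in table_info("COPY_DEPARTMENT")]
copy_pk_flags = [col[5] for col in table_info("COPY_DEPARTMENT")]
copied_rows = conn.execute("SELECT COUNT(*) FROM COPY_DEPARTMENT").fetchone()[0]

print(copy_columns, copied_rows, copy_pk_flags)
```

In this SQLite variant the copy has the same column names and no rows, and it does not inherit the primary key, so keys and relationships have to be declared again on the copy.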

Three-valued predicate logic:

1. TRUE

2. FALSE

3. UNKNOWN

A tuple is selected only when the comparison evaluates to TRUE. For example, TRUE AND UNKNOWN evaluates to UNKNOWN, so the tuple will not appear in the result.
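A short sketch with Python's built-in sqlite3 module shows the effect of UNKNOWN on a WHERE clause. The table employee and its rows are hypothetical, and SQLite represents UNKNOWN as NULL, which Python receives as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, salary INTEGER);
    INSERT INTO employee VALUES ('Ann', 6000), ('Bob', 3000), ('Cid', NULL);
""")

def names(condition):
    """Names of the employees for which the condition evaluates to TRUE."""
    sql = f"SELECT name FROM employee WHERE {condition} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

high = names("salary > 5000")            # comparison is TRUE only for Ann
not_high = names("NOT (salary > 5000)")  # NOT UNKNOWN is still UNKNOWN, so Cid is absent
missing = names("salary IS NULL")        # IS NULL is the way to find Cid

# TRUE AND UNKNOWN evaluates to UNKNOWN.
true_and_unknown = conn.execute("SELECT (1 = 1) AND (NULL > 5000)").fetchone()[0]

print(high, not_high, missing, true_and_unknown)
```

Cid appears in neither of the first two results, which is the practical consequence of the third truth value.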

Basic process of database application design (Phases of the Database Design and Implementation Process):

Phase 1: Requirements Collection and Analysis

Phase 2: Conceptual Database Design

Phase 3: Choice of a DBMS

Phase 4: Data Model Mapping (Logical Database Design)

Phase 5: Physical Database Design

Phase 6: Database System Implementation

Phase 7: Database System Operation and Maintenance

ER Diagram Symbol Explanation:

Steps for mapping an ER model to a logical model:

1. Map strong entity types

2. Map weak entity types

3. Map 1:1 binary relationship types

4. Map 1:N binary relationship types

5. Map M:N binary relationship types

6. Map multi-valued attributes

7. Map n-ary relationship types
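Several of these steps can be illustrated with table definitions. The sketch below uses Python's built-in sqlite3 module and an invented company schema (department, employee, dependent, project, works_on, dept_location). It shows one common way of carrying out steps 1, 2, 4, 5, and 6, and is not the only possible mapping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Step 1: a strong entity type becomes a table with its own primary key.
    CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT NOT NULL);
    CREATE TABLE project (pno INTEGER PRIMARY KEY, pname TEXT NOT NULL);

    -- Step 4: a 1:N relationship becomes a foreign key on the N side.
    CREATE TABLE employee (
        eno   INTEGER PRIMARY KEY,
        ename TEXT NOT NULL,
        dno   INTEGER REFERENCES department(dno)
    );

    -- Step 2: a weak entity type takes its owner's key into its own primary key.
    CREATE TABLE dependent (
        eno      INTEGER NOT NULL REFERENCES employee(eno),
        dep_name TEXT NOT NULL,
        PRIMARY KEY (eno, dep_name)
    );

    -- Step 5: an M:N relationship becomes a table keyed by both participants.
    CREATE TABLE works_on (
        eno   INTEGER NOT NULL REFERENCES employee(eno),
        pno   INTEGER NOT NULL REFERENCES project(pno),
        hours REAL,
        PRIMARY KEY (eno, pno)
    );

    -- Step 6: a multi-valued attribute becomes a table keyed by owner plus value.
    CREATE TABLE dept_location (
        dno      INTEGER NOT NULL REFERENCES department(dno),
        location TEXT NOT NULL,
        PRIMARY KEY (dno, location)
    );
""")

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The six entity and relationship types of the hypothetical ER model end up as six tables, with the 1:N relationship absorbed into employee instead of getting a table of its own.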

Database normal forms:

1NF (First Normal Form): a relation E is in first normal form if and only if all of its domains contain only atomic values, i.e., every component is an indivisible data item

2NF (Second Normal Form): a relation E is in second normal form if and only if it is in first normal form and every non-key attribute is fully functionally dependent on the primary key

3NF (Third Normal Form): a relation E is in third normal form if and only if it is in second normal form and no non-key attribute in E is transitively dependent on the primary key
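The dependencies these definitions rely on can be checked against sample data. The sketch below is illustrative only: the relation GRADE(sno, cno, sname, grade) with primary key (sno, cno) is hypothetical, and a check on sample rows can refute a functional dependency but cannot prove that it holds for every possible state of the relation.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical relation GRADE with primary key (sno, cno).
grade = [
    {"sno": 1, "cno": "DB", "sname": "Ann", "grade": 90},
    {"sno": 1, "cno": "OS", "sname": "Ann", "grade": 70},
    {"sno": 2, "cno": "DB", "sname": "Bob", "grade": 90},
]

key_determines_grade = fd_holds(grade, ["sno", "cno"], ["grade"])
sname_depends_on_part_of_key = fd_holds(grade, ["sno"], ["sname"])
grade_depends_on_part_of_key = fd_holds(grade, ["sno"], ["grade"])

print(key_determines_grade, sname_depends_on_part_of_key, grade_depends_on_part_of_key)
```

Here sname depends on sno alone, which is only part of the key, so this GRADE relation would not be in 2NF; splitting it into STUDENT(sno, sname) and GRADE(sno, cno, grade) removes the partial dependency.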

Principles of Databases: An Overview of SQL

3.1.1 History of SQL

Assessment requirement: “memorization”

Knowledge point: history of SQL

SQL: Structured Query Language. Despite its name, it is more than a query language: it also covers data definition, querying, updating, and control.

3.1.2 Architecture of a SQL database

Assessment requirement: “understanding”

Knowledge point: understanding the three-tier structure

The architecture of a SQL database is also a three-tier structure, but its terminology differs from that of the traditional relational model. In SQL, a relational schema is called a “base table”, the storage schema is called a “storage file”, a sub-schema is called a “view”, tuples are called “rows”, and attributes are called “columns”.

The key points of the structure of a SQL database system are as follows:

(1) A SQL database is a collection of tables.

(2) A SQL table consists of a set of rows; each row is a sequence of columns, and each column corresponds to a data item.

(3) A table is either a base table or a view. A base table is a table actually stored in the database; a view is the definition of a table derived from one or more base tables or other views.

(4) A base table can span one or more storage files, and a storage file can hold one or more base tables. Storage files correspond to physical files.

(5) Users can manipulate tables, including both views and base tables, with SQL statements.

(6) The user of SQL can be an application or an end user.

3.1.3 Components of SQL

Assessment requirement: “memorization”

Knowledge point: the four components

SQL consists of four components:

(1) Data definition (SQL DDL): defines SQL schemas, base tables, views, and indexes.

(2) Data manipulation (SQL DML): includes data query and data update (insert, delete, modify).

(3) Data control: includes authorization on base tables and views, the specification of integrity rules, and transaction control.

(4) Rules for the use of embedded SQL.

Introduction to Database Principles and Applications

The book centers on relational database systems and gives a systematic, comprehensive account of the basic concepts, fundamental principles, and application techniques of database systems. Its main contents include an overview of database technology, relational databases, SQL (the standard language for relational databases), relational database design, database protection, network databases, the network database management system SQL Server 2000, distributed database systems, XML databases, and so on.

The book is clear in its concepts, well focused, sensibly organized into chapters, and closely integrates theory with practice. Each chapter comes with a wealth of exercises, cases, and experiments. The exercises help readers deepen their understanding of the content and consolidate the concepts; the cases present real database application scenarios, helping readers connect what they have learned to actual applications; and the experiments give readers concrete, hands-on practice that combines theory with practice. The design of the exercises, cases, and experiments is one of the more prominent features of this book.

This book can be used as a textbook for database courses in undergraduate computer science (information technology track), information management and information systems, and related majors, and as a self-study reference for scientific and technical personnel working in the information field.

Principles of Databases Chapter 2 Short Answer Summary

Chapter 2 Relational Models

19. Definitions of super key, candidate key, and primary key:

Super key: a set of attributes that uniquely identifies a tuple in a relation is called a super key of the relational schema.

Candidate key: a super key that contains no redundant attributes is called a candidate key. (There can be more than one candidate key.)

Primary key: the candidate key selected by the user to identify tuples is called the primary key. (The primary key is one of the candidate keys.)
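The relationship between the three kinds of key can be sketched with a small computation. The schema STUDENT(sno, id_card, name) and its functional dependencies are assumptions made for this example: a set of attributes is a super key when its closure under the dependencies covers the whole schema, and a candidate key is a super key with no smaller super key inside it.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def superkeys(schema, fds):
    attrs = sorted(schema)
    return [
        set(combo)
        for size in range(1, len(attrs) + 1)
        for combo in combinations(attrs, size)
        if closure(combo, fds) == schema
    ]

def candidate_keys(schema, fds):
    keys = superkeys(schema, fds)
    # Keep only the minimal super keys: no other super key is a proper subset.
    return [k for k in keys if not any(other < k for other in keys)]

# Hypothetical schema: both sno and id_card identify a student.
schema = {"sno", "id_card", "name"}
fds = [({"sno"}, {"id_card", "name"}), ({"id_card"}, {"sno"})]

all_superkeys = superkeys(schema, fds)
keys = candidate_keys(schema, fds)
print(len(all_superkeys), keys)
```

Either candidate key could be chosen as the primary key; the choice between sno and id_card is a design decision, not something the dependencies determine.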

20. Relational Schema, Relational Sub-schema and Storage Schema:

The relational model basically follows the three-tier architecture of the database. The conceptual schema is a collection of relational schemas, the external schema is a collection of relational sub-schemas, and the internal schema is a collection of storage schemas.

(1) Relational schema: a relational schema is essentially a record type. Its definition includes the schema name, the attribute names, the domain names, and the primary key of the schema.

(2) Relational sub-schema: a description of the part of the data used by a user. Besides identifying the user’s data, it should also specify the correspondence between the schema and the sub-schema.

(3) Storage schema: the basic unit of organization when storing a relation is a file, and tuples are the records in the file. A relation can be stored using hashing or indexing. If the relation has only a few tuples, it can also be stored as a heap file.

21. Three types of integrity rules for the relational model:

(1) Entity integrity rule: a tuple in a relation must not have null values in the attributes that make up the primary key.

(2) Referential integrity rule: this rule requires that there be “no reference to non-existent entities”.

(3) User-defined integrity rules: these reflect the semantic requirements that the data involved in a specific application must meet.

22. Formal Definition of Referential Integrity Rule:

If the attribute set K is the primary key of relational schema R1, and K is also a foreign key of relational schema R2, then only two kinds of values are allowed for K in a relation of R2: either it is null, or it is equal to one of the primary key values in the relation of R1.

There are three more points to note when using this rule:

(1) The foreign key and the corresponding primary key can have different names, as long as they are defined on the same domain.

(2) R1 and R2 can also be the same relational schema, which expresses a link between attributes of that schema.

(3) Whether the foreign key value is allowed to be null depends on the specific application.

In the formal definition above, the relational schema R1 is called the “referenced” schema and R2 is called the “dependent” schema.
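The rule and its notes can be tried on a case where R1 and R2 are the same schema. The sketch uses Python's built-in sqlite3 module; the table employee and the column manager_no are hypothetical, and the foreign key deliberately has a different name from the primary key it refers to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("""
    CREATE TABLE employee (
        eno        INTEGER PRIMARY KEY,
        ename      TEXT NOT NULL,
        manager_no INTEGER REFERENCES employee(eno)  -- same domain as eno, different name
    )
""")

conn.execute("INSERT INTO employee VALUES (1, 'Ann', NULL)")  # null foreign key: accepted
conn.execute("INSERT INTO employee VALUES (2, 'Bob', 1)")     # matches an existing key: accepted

try:
    conn.execute("INSERT INTO employee VALUES (3, 'Cid', 99)")  # no employee 99 exists
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT eno, manager_no FROM employee ORDER BY eno").fetchall()
print(rejected, stored)
```

Allowing NULL in manager_no is a choice made for this example (the top manager has no manager); declaring the column NOT NULL would forbid it.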

23. Formal definition of relational model:

24. What are the two types of relational query languages according to their theoretical foundations:

Relational algebra languages: DML languages in which query operations are based on set operations. (Weakly non-procedural)

Relational calculus languages: DML languages in which query operations are based on predicate calculus. (Strongly non-procedural)

25. What are the operations in relational algebra?

Operations in relational algebra can be divided into two categories:

Traditional set operations: union, difference, intersection, Cartesian product

Extended set operations: vertical partitioning of relations (projection), horizontal partitioning (selection), combination of relations (join, natural join), the inverse of the Cartesian product (division), and so on.

The five basic operations are: union, difference, Cartesian product, projection, and selection.

The four common composite operations are: intersection, join, natural join, and division.

The two extended relational algebra operations are: outer join and outer union.
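Three of the composite operations can be sketched in Python to show how they build on simpler ones. As an assumption of this sketch, a relation is a set of tuples and each tuple is a frozenset of (attribute, value) pairs; the relations takes, courses, and credits are invented sample data.

```python
def rel(*rows):
    """Build a relation from dicts: a set of frozensets of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}

def intersection(r, s):
    # Expressed with difference only: R ∩ S = R − (R − S).
    return r - (r - s)

def natural_join(r, s):
    """Pair tuples that agree on every common attribute, keeping each attribute once."""
    joined = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                joined.add(frozenset({**dt, **du}.items()))
    return joined

def division(r, s):
    """R ÷ S: the partial tuples of R that are combined in R with every tuple of S."""
    s_attrs = {a for u in s for a, _ in u}
    candidates = {frozenset((a, v) for a, v in t if a not in s_attrs) for t in r}
    return {x for x in candidates if all((x | u) in r for u in s)}

takes = rel(
    {"student": "Ann", "cno": "DB"},
    {"student": "Ann", "cno": "OS"},
    {"student": "Bob", "cno": "DB"},
)
courses = rel({"cno": "DB"}, {"cno": "OS"})
credits = rel({"cno": "DB", "credits": 4}, {"cno": "OS", "credits": 3})

took_everything = division(takes, courses)   # students who took every course
with_credits = natural_join(takes, credits)  # takes extended with the credits of each course
print(took_everything, len(with_credits))
```

Division answers "for all" questions, such as which students took every course, which is why it is described as the inverse of the Cartesian product.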

26. What are the two kinds of relational calculus:

Relational calculus can be divided into tuple relational calculus and domain relational calculus. The former uses tuples as variables and the latter uses attributes (domains) as variables.
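Python set comprehensions read much like the expressions of these two styles, which makes the difference in variables easy to see. This is only an analogy, and the Student relation with its rows is hypothetical.

```python
from collections import namedtuple

Student = namedtuple("Student", ["sno", "name", "age"])
students = {Student(1, "Ann", 19), Student(2, "Bob", 22), Student(3, "Cid", 21)}

# Tuple style: { t | t ∈ students ∧ t.age > 20 }
# The variable t ranges over whole tuples.
tuple_style = {t for t in students if t.age > 20}

# Domain style: { <n> | ∃ s, a (<s, n, a> ∈ students ∧ a > 20) }
# Each variable ranges over the values of one attribute.
domain_style = {n for (s, n, a) in students if a > 20}

print(sorted(t.name for t in tuple_style), sorted(domain_style))
```

Both comprehensions only range over the stored relation, which is the same restriction that keeps a calculus expression from describing an infinite result.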

27. What are bound variables and free variables:

28. What are safe operations:

In database technology, operations that produce neither infinite relations nor infinite verification are called safe operations, the corresponding expressions are called safe expressions, and the measures taken to ensure this are called safety constraints.

In relational calculus, it is agreed that the formulas in an expression are evaluated only over the range of values of the relations involved. This prevents the problems of infinite relations and infinite verification, so the relational calculus is safe.

29. Why optimize relational algebra expressions:

Query optimization means that the DBMS optimizes the combination of operations in a relational algebra expression in order to improve system efficiency. Relational algebra expressions are built by combining relational algebra operations. Among these operations, the Cartesian product and the join are the most time-consuming, and they generate a large number of intermediate results during execution, which lowers the efficiency of the system. Before execution, the query processing subsystem of the DBMS therefore optimizes the expression so that selection and projection operations are executed as early as possible. This yields smaller intermediate relations, reduces the amount of computation and the number of reads of external storage blocks, and so saves execution time and improves execution efficiency.

30. Briefly describe the optimization strategy of query optimization:

(1) Perform selection operations as early as possible in the relational algebraic expression.

(2) Combine the Cartesian product and subsequent selection operations into an F-join operation.

(3) Compute a sequence of selection and projection operations at the same time, so that performing them separately does not cause multiple scans of the file; this saves operation time.

(4) If a sub-expression occurs several times in an expression, compute it once in advance and save the result, to avoid repeated calculation.

(5) Preprocess the relation files appropriately.

(6) Before computing an expression, estimate how best to compute it.
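The benefit of strategy (1) can be made concrete by counting intermediate results. The sketch below is a toy model in plain Python, not how a DBMS executes queries: the relations students and grades and their sizes are invented, and the "cost" is simply the number of pairs produced by the Cartesian product.

```python
# Hypothetical data: 100 students (10 of them in CS) and 300 grade records.
students = [{"sno": i, "dept": "CS" if i % 10 == 0 else "EE"} for i in range(100)]
grades = [{"sno": i % 100, "cno": "DB", "score": 60 + i % 40} for i in range(300)]

def product(left, right):
    """Cartesian product as a list of pairs; its length is the intermediate result size."""
    return [(s, g) for s in left for g in right]

# Plan A: Cartesian product first, all selections afterwards.
pairs_a = product(students, grades)
plan_a = [
    {**s, **g} for s, g in pairs_a
    if s["sno"] == g["sno"] and s["dept"] == "CS"
]

# Plan B: apply the selection on dept before the product.
cs_students = [s for s in students if s["dept"] == "CS"]
pairs_b = product(cs_students, grades)
plan_b = [{**s, **g} for s, g in pairs_b if s["sno"] == g["sno"]]

print(len(pairs_a), len(pairs_b), plan_a == plan_b)
```

Both plans return the same rows, but the early selection shrinks the intermediate result by the same factor by which it shrinks the student relation.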

31. What is the difference between the Cartesian product, the equijoin, and the natural join:

An equijoin contains a Cartesian product operation: it is a Cartesian product followed by a selection on an equality condition.

A natural join is a kind of equijoin: it is the result of an equijoin on all the common attributes of the two relations, with the duplicate attributes removed.
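The three operations can be compared by the shape of their results. This is a small sketch with Python's built-in sqlite3 module; the tables student and grade and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (sno INTEGER, name TEXT);
    CREATE TABLE grade (sno INTEGER, score INTEGER);
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO grade VALUES (1, 90), (1, 70), (2, 80);
""")

def shape(sql):
    """Return (number of rows, number of columns) of a query result."""
    cursor = conn.execute(sql)
    rows = cursor.fetchall()
    return len(rows), len(cursor.description)

cartesian = shape("SELECT * FROM student CROSS JOIN grade")
equijoin = shape("SELECT * FROM student JOIN grade ON student.sno = grade.sno")
natural = shape("SELECT * FROM student NATURAL JOIN grade")

print(cartesian, equijoin, natural)
```

The equijoin keeps a subset of the rows of the Cartesian product with the same columns, and the natural join returns the same rows as the equijoin with the shared sno column appearing only once.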

Introduction to Database Principles and Applications

“Database Principles and Applications” is the companion textbook for the Shanghai municipal quality course “Database Principles and Applications”.

The book gives a systematic and comprehensive account of the basic theory, basic techniques, and basic methods of database systems, in 11 chapters and two appendices. Its content covers the basic concepts of databases, data models, relational databases, the relational database standard language SQL, triggers, stored procedures, data integrity, database security, relational database theory, indexing, database design, transaction management, concurrency control, database backup and recovery, data warehousing, data mining and new database technologies, the use of SQL Server 2005, and experimental guidance.

The SQL examples in the book have all been tested in the SQL Server 2005 environment.

The experimental guidance that accompanies this textbook (Appendix B) draws on the author’s many years of teaching database laboratory courses. It uses SQL Server as the experimental environment, and its content is rich, comprehensive, and very practical.

The book can be used as a textbook for undergraduate database courses in computer science, software engineering, information security, information management and information systems, information and computing science, and related majors at higher education institutions, and also as teaching material for database courses for graduate students in electrical engineering-related fields and for information technology training in power enterprises.