Data engineering
This blog covers some commonly used terms in the world of data engineering.
Types of data sources
- Structured data source: Data organized as tables of rows and columns, where every record follows the same schema.
- Semi-structured data source: Data that is not in tabular form but still has some structure, such as keys or tags. Ex - JSON, XML
- Unstructured data source: Data that does not have any pre-defined structure. Ex - text, video, audio, images, etc.
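To make the three categories concrete, here is a minimal Python sketch. The sample values (`csv_text`, `json_text`, `note`) are made up for illustration and do not come from any real dataset.

```python
import csv
import io
import json

# Structured: every record has the same columns (hypothetical sample data).
csv_text = "id,name,city\n1,Asha,Pune\n2,Ben,Leeds\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: records carry their own keys and may nest or omit fields.
json_text = '[{"id": 1, "name": "Asha", "tags": ["a", "b"]}, {"id": 2, "name": "Ben"}]'
records = json.loads(json_text)

# Unstructured: no predefined fields; any structure has to be extracted.
note = "Asha moved to Pune last year and Ben still lives in Leeds."
word_count = len(note.split())

print(rows[0]["city"])        # column access works for every row
print("tags" in records[1])   # a key may be missing from some records
print(word_count)
```

Notice that the CSV rows can be accessed by column name without checking, while the JSON records need a check for each optional key, and the free text offers no fields at all.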
Types of source systems
- Databases: Store data in an organized way, either structured or semi-structured.
- Files: A sequence of bytes representing information. Ex - TXT, PNG, MP3, CSV, etc.
- Streaming systems: A continuous flow of data, typically semi-structured. Ex - IoT sensors
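As a rough sketch of how reading from each kind of source system might look, the Python snippet below uses an in-memory SQLite database, a temporary file, and a generator standing in for a sensor feed. The table name `users`, the file name `sample.csv`, and the `sensor_events` function are hypothetical names chosen for illustration; a real streaming source would be a message broker or device feed rather than a generator.

```python
import itertools
import json
import os
import sqlite3
import tempfile

# Database: query organized data with SQL (in-memory SQLite for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Asha')")
db_rows = conn.execute("SELECT id, name FROM users").fetchall()
conn.close()

# File: a sequence of bytes; the format decides how to interpret them.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    with open(path, "wb") as f:
        f.write(b"id,name\n1,Asha\n")
    with open(path, "rb") as f:
        raw_bytes = f.read()

# Streaming system: events arrive continuously; this generator stands in
# for a hypothetical IoT temperature sensor that never stops emitting.
def sensor_events():
    for i in itertools.count():
        yield json.dumps({"sensor_id": "s1", "seq": i, "temp_c": 20.0 + i * 0.5})

# A consumer reads as many events as it needs; here, only the first three.
first_three = [json.loads(e) for e in itertools.islice(sensor_events(), 3)]
```

The database and the file have a definite end, while the stream does not, which is why the consumer has to decide how many events to take.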
ACID properties
- Atomicity: Each transaction is treated as a single, indivisible unit. Either all of its operations succeed or none of them are applied.
- Consistency: Any changes to the data made within a transaction follow the set of rules or constraints defined by the database schema.
- Isolation: Concurrent transactions do not interfere with each other. At the strictest isolation level, the result is the same as if the transactions had been executed one after another.
- Durability: Once a transaction is committed, its effects are permanent and will survive subsequent system failures.
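Atomicity and consistency can be illustrated with a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` function, and the balances are hypothetical. The database is in-memory, so this example does not demonstrate durability, and with a single connection it does not exercise isolation either.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency rule: the schema forbids negative balances.
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit, so the credit is undone too.
        return False

ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would make alice's balance negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the second transfer the credit to `bob` runs first and succeeds, but the debit violates the constraint, so the whole transaction is rolled back and neither balance changes.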
To be continued..