
Enterprise Data Management (Data Warehouse / ETL)

 A data warehouse, also known as an enterprise data warehouse (EDW), is a system that stores and analyzes data from multiple sources.

Data warehouses are used for reporting and data analysis, and are considered a core component of business intelligence.

Case study: 

Trends is an affordable fashion and lifestyle retailer that operates more than 2,300 stores in over 1,100 towns and cities across India.

A retailer of this size is likely to keep goods in several different warehouses and to use an enterprise application to track that stock and analyze it in order to improve the business.


A data warehouse has four main components (refer to the diagram above):
  1. Central database
  2. ETL (extract, transform, load) tools
  3. Metadata (data sources)
  4. Access tools

Some examples of data warehouses include:
  1. Snowflake
  2. Google BigQuery
  3. Amazon Redshift
  4. Azure Synapse Analytics
  5. IBM Db2 Warehouse
  6. Firebolt 

Benefits of Data Warehouse:
  1. Understand business trends and make better forecasting decisions.
  2. Data warehouses are designed to perform well with enormous amounts of data.
  3. The structure of data warehouses is more accessible for end-users to navigate, understand, and query.
  4. Queries that would be complex in many normalized databases could be easier to build and maintain in data warehouses.
  5. Data warehousing is an efficient method to manage demand for lots of information from lots of users.
  6. Data warehousing provides the capability to analyze large amounts of historical data.
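
Benefit 4 can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a real warehouse, and the tables (fact_sales, dim_store), columns, and figures are hypothetical, made up only for this example. With one fact table and one dimension table, a "sales by city and year" report needs a single join:

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory database with one dimension and one fact table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_store (
            store_id INTEGER PRIMARY KEY,
            city     TEXT
        );
        CREATE TABLE fact_sales (
            sale_id   INTEGER PRIMARY KEY,
            store_id  INTEGER REFERENCES dim_store(store_id),
            sale_year INTEGER,
            amount    REAL
        );
    """)
    # Hypothetical sample data, not real figures.
    conn.executemany(
        "INSERT INTO dim_store VALUES (?, ?)",
        [(1, "Mumbai"), (2, "Chennai")],
    )
    conn.executemany(
        "INSERT INTO fact_sales (store_id, sale_year, amount) VALUES (?, ?, ?)",
        [(1, 2023, 100.0), (1, 2024, 150.0), (2, 2023, 80.0), (2, 2024, 120.0)],
    )
    conn.commit()
    return conn


def sales_by_city_and_year(conn):
    """Aggregate the fact table, labelled by the store dimension."""
    return conn.execute("""
        SELECT d.city, f.sale_year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS d ON d.store_id = f.store_id
        GROUP BY d.city, f.sale_year
        ORDER BY d.city, f.sale_year
    """).fetchall()


if __name__ == "__main__":
    for city, year, total in sales_by_city_and_year(build_demo_warehouse()):
        print(city, year, total)
```

A production warehouse such as those listed earlier would hold far more data and more dimensions, but the query shape (a fact table joined to descriptive dimensions, then grouped) stays much the same.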

Enterprise Data Management (EDM) governance and control:

Enterprise data management (EDM) is the process of defining, integrating, and retrieving data for internal applications and external communication. 

EDM also involves managing the people who access the data, ensuring that they have the right information and follow the organization's standards for storing data. 

EDM focuses on creating accurate, consistent, and transparent content. 


EDM Architecture:

Data Management Maturity Model:


Extract, transform, and load (ETL) is the process of combining data from multiple sources into a large, central repository called a data warehouse. ETL uses a set of business rules to clean and organize raw data and prepare it for storage, data analytics, and machine learning (ML).
Here are some advantages of ETL:
  1. Data governance: ETL fosters data democratization, making data accessible to all stakeholders so that they can analyze it and use it for the business.
  2. Reduce delivery time: an automated pipeline can deliver prepared data faster than collecting and cleaning it by hand.
  3. Reduce unnecessary expenses: consolidating data in one repository can cut the cost of duplicated, manual data handling.
  4. Automate complex processes: multi-step extraction and transformation work can run on a schedule without manual intervention.
  5. Validate data before migration: business rules can check the data and reject bad records before they are loaded.
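
The extract, transform, and load steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two CSV "exports", the column names, the validation rules, and the sales table are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw exports from two source systems (assumed for this sketch).
STORE_A_CSV = "sku,qty,price\ntshirt-01, 3 ,499\nJEANS-02,1,1299\n"
STORE_B_CSV = "sku,qty,price\nTSHIRT-01,2,499\nBAD-ROW,,abc\n"


def extract(csv_text, source):
    """Extract: read raw rows and tag each with the system it came from."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row, source=source) for row in reader]


def transform(rows):
    """Transform: clean values, apply business rules, set aside bad rows."""
    clean, rejected = [], []
    for row in rows:
        try:
            qty = int(row["qty"].strip())
            price = float(row["price"].strip())
        except (ValueError, AttributeError):
            rejected.append(row)  # non-numeric or missing field
            continue
        if qty <= 0 or price < 0:
            rejected.append(row)  # fails the (assumed) business rules
            continue
        sku = row["sku"].strip().upper()  # normalize identifiers
        clean.append((row["source"], sku, qty, qty * price))
    return clean, rejected


def load(conn, records):
    """Load: write validated records into the central table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(source TEXT, sku TEXT, qty INTEGER, revenue REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", records)
    conn.commit()


def run_pipeline():
    """Run extract, transform, and load end to end."""
    conn = sqlite3.connect(":memory:")
    raw = extract(STORE_A_CSV, "store_a") + extract(STORE_B_CSV, "store_b")
    clean, rejected = transform(raw)
    load(conn, clean)
    return conn, rejected


if __name__ == "__main__":
    conn, rejected = run_pipeline()
    print(conn.execute("SELECT sku, SUM(qty) FROM sales GROUP BY sku").fetchall())
    print("rejected rows:", len(rejected))
```

Keeping the rejected rows instead of silently dropping them is a deliberate choice here: it lets the data be validated, and the failures reviewed, before anything reaches the warehouse.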
Amazon Redshift:




