My IT Weblog


What is Master Data?

7 02 2008

In this podcast, Malcolm Chisholm gives an interesting categorization of data.

In brief, there are six types of data:

  1. Metadata (e.g. table names…)
  2. Reference Data (e.g. code tables…)
  3. Enterprise Structure Data (the hierarchical organization of the enterprise's data, e.g. product lines…)
  4. Transaction Structure Data (e.g. customer, product data…)
  5. Transaction Activity Data (transactional event data, e.g. orders…)
  6. Transaction Audit Data (state changes in transactional event data, e.g. logs…)

Master data are then the aggregation of reference data, enterprise structure data and transaction structure data. This means that there is more than one category of master data, so any good MDM tool must take the meaning of each type of master data into account.
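
The categorization above can be sketched as a small data structure. This is only an illustration of the six types and of which ones count as master data; the names `DataCategory`, `MASTER_DATA_CATEGORIES` and `is_master_data` are made up for this example and do not come from the podcast or from any MDM tool.

```python
from enum import Enum


class DataCategory(Enum):
    """The six types of data listed above (identifiers are hypothetical)."""
    METADATA = "Metadata"
    REFERENCE = "Reference Data"
    ENTERPRISE_STRUCTURE = "Enterprise Structure Data"
    TRANSACTION_STRUCTURE = "Transaction Structure Data"
    TRANSACTION_ACTIVITY = "Transaction Activity Data"
    TRANSACTION_AUDIT = "Transaction Audit Data"


# Master data = reference + enterprise structure + transaction structure data.
MASTER_DATA_CATEGORIES = frozenset({
    DataCategory.REFERENCE,
    DataCategory.ENTERPRISE_STRUCTURE,
    DataCategory.TRANSACTION_STRUCTURE,
})


def is_master_data(category: DataCategory) -> bool:
    """Return True if the given category is one of the master data categories."""
    return category in MASTER_DATA_CATEGORIES


if __name__ == "__main__":
    for category in DataCategory:
        label = "master data" if is_master_data(category) else "not master data"
        print(f"{category.value}: {label}")
```

Keeping the category explicit on each data set is one simple way for a tool to treat, say, code tables differently from customer records, even though both are master data.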



Categories : MDM, metadata

Necessary metadata

28 01 2008

In this note, Bill Inmon complains about the endless task of documenting all possible metadata. Let’s think a bit about this.

If we define metadata as “data about data”, it is clear that documenting all metadata is an impossible, endless task. Because metadata is itself data, metadata can describe data but can also describe other metadata. Hence the endless loop of documenting the data used for documenting the data…

But not all metadata change that much. As an example, see the CWM (Common Warehouse Metamodel) specification, which already describes most of the metadata needed in the data management domain.
Of course, CWM is not exhaustive and cannot be. But maybe CWM could play the role of the “necessary metadata” sought by Bill Inmon, at least in the domain of data management.

Then data like averages, maximums and other computed values are not really metadata as I understand the term. They do not describe the data; they are computed from the data. They depend on instances and not on data classes. They are complementary information about the data, and I don’t think they should be called metadata. Otherwise everything is metadata, since every piece of data is data about some other data. Then we could ask ourselves “what is data?”
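
The distinction between class-level descriptions and instance-level computations can be sketched as follows. This is a minimal illustration under my own assumptions: `ColumnMetadata`, `column_statistics` and the `orders.amount` column are hypothetical names invented for the example, not part of CWM or of any tool.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ColumnMetadata:
    """Describes a column (the data class); it does not depend on the stored rows."""
    table: str
    name: str
    data_type: str
    nullable: bool


def column_statistics(values):
    """Computed from the instances; the result changes whenever the rows change."""
    return {"average": mean(values), "maximum": max(values)}


# Hypothetical column description: stays the same as rows are added.
amount = ColumnMetadata(table="orders", name="amount", data_type="DECIMAL", nullable=False)

# Hypothetical row values on two different days.
stats_before = column_statistics([10, 20, 30])
stats_after = column_statistics([10, 20, 30, 100])

if __name__ == "__main__":
    print(amount)
    print(stats_before)
    print(stats_after)
```

The `ColumnMetadata` value is identical before and after the new row arrives, while the statistics differ, which is the sense in which computed values depend on instances rather than on data classes.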

Finally, it seems obvious that documenting all metadata is impossible. It’s like writing the perfect program: even for a simple program that takes an input and writes it to the output, it is possible to write pages of code to handle all the use cases we can think of. Even then, the program will not be able to handle some unexpected cases.



Categories : metadata

