Chapter 29 Database System (8)

Data Warehousing

Data warehouses contain consolidated data from many sources, spanning long time periods, and augmented with summary information. Warehouses are much larger than other kinds of databases; sizes ranging from several gigabytes to terabytes are common. Typical workloads involve ad hoc, fairly complex queries, and fast response time is important. These characteristics differentiate warehouse applications from OLTP applications, and different DBMS design and implementation techniques must be used to achieve satisfactory results. A distributed DBMS with good scalability and high availability (achieved by storing tables redundantly at more than one site) is required for very large warehouses.

An organization's daily operations access and modify operational databases. Data from these operational databases and other external sources (e.g., customer profiles supplied by external consultants) are extracted by using gateways, or standard external interfaces supported by the underlying DBMS. Standards such as Open Database Connectivity (ODBC) from Microsoft are emerging for gateways; ODBC is an application program interface that allows client programs to generate SQL statements to be executed at a server.

There are many challenges in creating and maintaining a large data warehouse. A good database schema must be designed to hold an integrated collection of data copied from diverse sources. For example, a company warehouse might include the Inventory and Personnel departments' databases, together with Sales databases maintained by offices in different countries. Since the source databases are often created and maintained by different groups, there are a number of semantic mismatches across these databases, such as different currency units, different names for the same attribute, and differences in how tables are normalized or structured; these differences must be reconciled when data is brought into the warehouse. After the warehouse schema is designed, the warehouse must be populated, and over time, it must be kept consistent with the primary data sources.
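As a rough illustration of reconciling two of the mismatches named above (different attribute names and different currency units), the following Python sketch maps rows from two sales sources onto one warehouse layout. The source names, field names, and exchange rates are all hypothetical and chosen only for the example; a real warehouse would take them from its own schema and a maintained rate table.

```python
# Hypothetical conversion rates to a common currency (USD).
# The values are illustrative, not real exchange rates.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25}

# Hypothetical per-source mapping from local attribute names
# to the attribute names used in the warehouse schema.
FIELD_MAPS = {
    "sales_us": {"cust_id": "customer_id", "amt": "amount", "cur": "currency"},
    "sales_de": {"kunde": "customer_id", "betrag": "amount", "waehrung": "currency"},
}


def reconcile(source, row):
    """Rename a source row's attributes and convert its amount to USD."""
    mapping = FIELD_MAPS[source]
    renamed = {mapping[name]: value for name, value in row.items()}
    rate = RATES_TO_USD[renamed["currency"]]
    return {
        "customer_id": renamed["customer_id"],
        "amount_usd": renamed["amount"] * rate,
        "source": source,
    }


us_row = reconcile("sales_us", {"cust_id": 1, "amt": 40.0, "cur": "USD"})
de_row = reconcile("sales_de", {"kunde": 7, "betrag": 10.0, "waehrung": "EUR"})
```

Both rows end up with the same attribute names and the same currency unit, so they can be stored in a single warehouse table regardless of which office produced them.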

Data extracted from operational databases and external sources is first cleaned to minimize errors and fill in missing information when possible, and transformed to reconcile semantic mismatches. Transforming data is typically accomplished by defining a relational view over the tables in the data sources (the operational databases and other external sources). Loading data consists of materializing such views and storing them in the warehouse. Unlike a standard view in a relational DBMS, therefore, the view is stored in a database (the warehouse) that is different from the database(s) containing the tables it is defined over.
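The idea of defining a view over source tables and materializing it in a separate database can be sketched with Python's built-in sqlite3 module. This is only a small-scale analogy: the table, view, and column names are hypothetical, and two in-memory SQLite databases stand in for an operational database and a warehouse.

```python
import sqlite3

# The main in-memory database plays the role of an operational source.
conn = sqlite3.connect(":memory:")
# A second, separate in-memory database plays the role of the warehouse.
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

# Hypothetical operational table.
conn.execute("CREATE TABLE orders (order_id INTEGER, cust TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 1250), (2, "bob", 300), (3, "alice", 750)],
)

# The transformation is expressed as a relational view over the source table.
conn.execute(
    """
    CREATE VIEW orders_clean AS
    SELECT order_id, UPPER(cust) AS customer, amount_cents / 100.0 AS amount
    FROM orders
    """
)

# Loading: materialize the view's current contents as a table in the warehouse.
conn.execute("CREATE TABLE warehouse.orders_fact AS SELECT * FROM orders_clean")

# A later change to the source is visible through the view, but not in the
# materialized copy until the warehouse is refreshed.
conn.execute("INSERT INTO orders VALUES (4, 'carol', 900)")
view_rows = conn.execute("SELECT COUNT(*) FROM orders_clean").fetchone()[0]
fact_rows = conn.execute("SELECT COUNT(*) FROM warehouse.orders_fact").fetchone()[0]
```

After the last insert, the view and the materialized table disagree on the row count, which is a small instance of why a warehouse must be kept consistent with its primary data sources over time.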

The cleaned and transformed data is finally loaded into the warehouse. Additional preprocessing such as sorting and generation of summary information is carried out at this stage. Data is partitioned and indexes are built for efficiency. The large volume of data to be loaded means that loading is a slow process; loading a terabyte of data sequentially can take weeks. Parallelism is therefore important for loading warehouses.
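The loading steps described above (partitioning, sorting, generating summary information, and working on partitions in parallel) can be sketched as follows. This is a minimal sketch under stated assumptions: the row layout and the choice of `region` as the partitioning key are hypothetical, and a thread pool stands in for the multiple processes or machines a real parallel loader would use.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_region(rows):
    """Split rows into partitions keyed by a hypothetical 'region' attribute."""
    parts = defaultdict(list)
    for row in rows:
        parts[row["region"]].append(row)
    return dict(parts)


def load_partition(item):
    """Sort one partition by day and compute its summary total."""
    region, rows = item
    ordered = sorted(rows, key=lambda r: r["day"])
    total = sum(r["amount"] for r in ordered)
    return region, {"rows": ordered, "total": total}


def parallel_load(rows, workers=4):
    """Preprocess every partition concurrently and collect the results."""
    parts = partition_by_region(rows)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(load_partition, parts.items()))


sample = [
    {"region": "EU", "day": 2, "amount": 20},
    {"region": "US", "day": 1, "amount": 5},
    {"region": "EU", "day": 1, "amount": 10},
]
loaded = parallel_load(sample)
```

Because each partition is sorted and summarized independently of the others, the work divides cleanly across workers, which is the property a parallel loader relies on.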
