Databases get a grip on XML
If you could do one thing to improve integration and automate processes with customers and business partners, it would be to implement XML, which has become the standard for exchanging information between disparate systems because it is easily transformed into any format. The good news is that the four leading relational databases, namely Oracle Database, IBM DB2, Sybase ASE, and Microsoft SQL Server, can not only store XML data but also hide much of the complexity of working with XML.
What does a fashionable XML database provide? Four basic functions: the ability to consume, store, search, and generate XML. The extent to which the database supports these functions and the methods it uses to accomplish them are what make for a successful implementation of XML in a database.
Relational databases and XML documents are both powerful ways to represent relationships among data, but they're powerful in different ways. For example, querying on a patient ID number in a relational database may allow you to quickly find the dates a certain patient visited the hospital, the conditions he was diagnosed with, and the treatments he was given. But it likely won't help you determine which treatments were provided for which conditions or what times the treatments took place, nor will it give you other useful information that XML versions of these records could provide.
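To make the patient example concrete, here is a minimal Python sketch. The record layout, element names, and values are invented for illustration and do not come from any real hospital system. It shows how nesting each treatment under its condition lets you answer "which treatment for which condition, and at what time" directly from the XML hierarchy.

```python
import xml.etree.ElementTree as ET

# Hypothetical patient record; element and attribute names are illustrative only.
RECORD = """
<patient id="P100">
  <visit date="2005-03-01">
    <condition name="fracture">
      <treatment name="cast" time="10:30"/>
    </condition>
    <condition name="infection">
      <treatment name="antibiotics" time="11:15"/>
    </condition>
  </visit>
</patient>
"""

def treatments_by_condition(xml_text):
    """Map each condition to the (treatment, time) pairs nested under it."""
    root = ET.fromstring(xml_text)
    result = {}
    for condition in root.iter("condition"):
        result[condition.get("name")] = [
            (t.get("name"), t.get("time")) for t in condition.findall("treatment")
        ]
    return result

pairs = treatments_by_condition(RECORD)
```

A flat relational layout keyed only on the patient ID would hold the same conditions and treatments, but the link between them would have to be modeled separately.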
But whether you can combine the benefits of relational and XML data depends on how you store the XML. There are three methods for physically storing XML data in a relational database: shredded, unstructured, and structured. The shredded and unstructured methods are useful but limited. The structured method allows you to leverage the power of both relational data and XML hierarchies.
Shredding puts XML data into relational columns but strips it of its XMLness, meaning the hierarchical relationships among the data in the original XML document are lost. Shredding is useful when you're not concerned about keeping the data in XML format. For example, let's say you have a Web site that allows customers to place orders, and the order needs to go to a number of different database systems. Producing an XML file and having the different systems pick it up (that is, shred it) from a network share may be the most efficient and error-free way to get the data where you want it to go.
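Shredding can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any of the four vendors implements it: SQLite stands in for the relational database, and the order document, table names, and column names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order file as a Web site might drop it on a network share.
ORDER_XML = """
<order id="1001" customer="C42">
  <item sku="A-1" qty="2"/>
  <item sku="B-7" qty="1"/>
</order>
"""

def shred_order(conn, xml_text):
    """Copy the order's values into relational rows; the XML nesting is not kept."""
    root = ET.fromstring(xml_text)
    conn.execute("CREATE TABLE IF NOT EXISTS orders (order_id TEXT, customer TEXT)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_items (order_id TEXT, sku TEXT, qty INTEGER)"
    )
    conn.execute(
        "INSERT INTO orders VALUES (?, ?)", (root.get("id"), root.get("customer"))
    )
    conn.executemany(
        "INSERT INTO order_items VALUES (?, ?, ?)",
        [
            (root.get("id"), item.get("sku"), int(item.get("qty")))
            for item in root.findall("item")
        ],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in database here
shred_order(conn, ORDER_XML)
rows = conn.execute("SELECT sku, qty FROM order_items ORDER BY sku").fetchall()
```

After shredding, the data is queryable with ordinary SQL, but nothing in the tables records that it once arrived as a single XML document.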
The unstructured method uses a data type called a CLOB (Character Large Object) to store an entire XML document as a single unit. Databases have been doing this for years with other types of documents, so this is nothing new. Search capabilities are limited, because you can't base queries on the elements inside the document, but the structure of the original data is preserved, which makes the method quite useful all the same. A good use for unstructured XML storage is keeping original documents to comply with government regulations. For example, if a financial institution receives original loan documents in XML, it can keep a relational record of each loan application and store the original application with that record.
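The loan example might look like the following sketch. It assumes SQLite as a stand-in, with a TEXT column playing the role of the CLOB. The table, columns, and loan document are hypothetical.

```python
import sqlite3

# Hypothetical original loan application, kept byte for byte as received.
LOAN_XML = (
    '<loanApplication id="L-77">'
    "<applicant>Jane Doe</applicant><amount>25000</amount>"
    "</loanApplication>"
)

conn = sqlite3.connect(":memory:")
# original_doc plays the role of the CLOB: the whole document in one column.
conn.execute(
    "CREATE TABLE loans (loan_id TEXT PRIMARY KEY, amount INTEGER, original_doc TEXT)"
)
conn.execute("INSERT INTO loans VALUES (?, ?, ?)", ("L-77", 25000, LOAN_XML))

# The relational columns are queryable; the document comes back unchanged.
stored = conn.execute(
    "SELECT original_doc FROM loans WHERE loan_id = ?", ("L-77",)
).fetchone()[0]

# Searching inside the document is limited to plain text matching.
matches = conn.execute(
    "SELECT loan_id FROM loans WHERE original_doc LIKE ?", ("%Jane Doe%",)
).fetchall()
```

The text match finds the string anywhere in the document; it cannot tell whether "Jane Doe" is the applicant or appears in some other element.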
The structured method allows you to store XML data inside the database and preserve the hierarchy of the data. Structured storage, also known as “native XML” storage, is what every vendor is trying to achieve. The most obvious benefit of preserving the hierarchical relationships of XML data is being able to receive an XML document, combine it or manipulate it with relational data, and produce XML as a result. It isn't possible to produce such result sets with a relational query language alone.
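The flow described here (XML in, combined with relational data, XML out) can be imitated in application code. The sketch below is not native XML storage. It uses Python and SQLite, with a hypothetical products table and order document, only to show the kind of result a structured store is meant to produce inside the database itself.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical incoming order; the hierarchy (order -> item) must survive.
ORDER_XML = '<order id="1001"><item sku="A-1" qty="2"/><item sku="B-7" qty="1"/></order>'

def price_order(conn, xml_text):
    """Annotate each item with its price from a relational table; return XML."""
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        row = conn.execute(
            "SELECT price FROM products WHERE sku = ?", (item.get("sku"),)
        ).fetchone()
        if row is not None:
            item.set("price", row[0])
    return ET.tostring(root, encoding="unicode")

conn = sqlite3.connect(":memory:")
# Prices are stored as text so they can be copied into XML attributes as-is.
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, price TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)", [("A-1", "9.50"), ("B-7", "20.00")]
)
priced = price_order(conn, ORDER_XML)
```

The result is still an order document with its items nested as before, now enriched with relational data.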