XMLNews-Story Tutorial
Copyright (c) 1999 by XMLNews.org. Free redistribution permitted.
XMLNews-Story is an XML document type for text-based news and information; XMLNews-Story defines the format of a news story's content, while the associated XMLNews-Meta specification defines the format of metadata associated with a story or any other kind of news object. For more information on how the different formats work together, see the XMLNews Technical Overview.
This tutorial will help you learn about the basic structure of XMLNews-Story documents. Before you read the tutorial, you should be familiar with the basics of XML syntax and structure; if you do not have previous experience with XML, please read the document on XML Basics first.
The tutorial provides a general technical introduction to XMLNews-Story, but does not cover all of the details comprehensively. For more complete information, please refer to the XMLNews-Story Specification and to the XMLNews-Story Document Type Definition. In case of disagreement, the specification is authoritative.
This section introduces you to the XMLNews-Story document type by building a simple XMLNews-Story document, one piece at a time.
It is always a good idea to begin an XML document with an XML declaration that states the version of XML used (currently, the only version is 1.0) and, optionally, the character encoding of the document:
<?xml version="1.0"?>
For more information, see the description of the XML Declaration in the XML Basics introduction.
Now that there's an XML declaration, you can begin adding the structure for your story. Every XML document contains exactly one root element: all other elements must begin and end inside it. For XMLNews-Story documents, the root element is always nitf, so the news story must begin with an “<nitf>” start tag and end with an “</nitf>” end tag:
<?xml version="1.0"?>
<nitf>
</nitf>
You will insert all of the other text and markup for the story between these two tags.
Within the nitf element there are always two subelements, head and body (these are both taken from HTML), so you can add the start and end tags for these two divisions to the news story:
<?xml version="1.0"?>
<nitf>
  <head>
  </head>
  <body>
  </body>
</nitf>
The story still does not contain any text, but the structure is starting to take shape. The head element will contain simple meta information for the story (in this tutorial, just a plain-text title); the body element will contain the main text of the story.
Now that the start tags for the major head and body elements are in place, you can fill in the contents of the head element. In this example, head contains only one subelement, title, which contains a plain-text title for the document:
<?xml version="1.0"?>
<nitf>
  <head>
    <title>Colombia Earthquake</title>
  </head>
  <body>
  </body>
</nitf>
Note that the title may or may not be the same as the headline: the title element contains the text that should appear in a listing of stories, a pull-down menu, the title bar of a window, or anywhere else that a short, textual title is appropriate.
Unlike in HTML, the body element of an XMLNews-Story document is further subdivided into the body.head element, which contains the story's frontmatter (such as the headline, byline, and dateline) and the body.content element, which contains the main text of the story.
To add a headline to the story, you first create the body.head element within body; next, you add the hedline element within body.head. The hedline element contains one hl1 element (the main headline) and zero or more hl2 elements (subheadlines).
In the following example, there is only the main headline:
<?xml version="1.0"?>
<nitf>
  <head>
    <title>Colombia Earthquake</title>
  </head>
  <body>
    <body.head>
      <hedline>
        <hl1>143 Dead in Colombia Earthquake</hl1>
      </hedline>
    </body.head>
  </body>
</nitf>
In addition to the headline, the body.head element can also contain one or more bylines. To add a byline, you first insert a byline element, then insert the actual text of the byline inside a bytag element:
<?xml version="1.0"?>
<nitf>
  <head>
    <title>Colombia Earthquake</title>
  </head>
  <body>
    <body.head>
      <hedline>
        <hl1>143 Dead in Colombia Earthquake</hl1>
      </hedline>
      <byline>
        <bytag>By Jared Kotler, Associated Press Writer</bytag>
      </byline>
    </body.head>
  </body>
</nitf>
Note that bytag contains the full text of the byline, including the word “by” and the writer's affiliation, if required: in other words, the byline may contain any free-form text.
The story's dateline also belongs in the body.head element rather than in the main body.content. To insert a dateline, first create a dateline element, and then add location and story.date inside it:
<?xml version="1.0"?>
<nitf>
  <head>
    <title>Colombia Earthquake</title>
  </head>
  <body>
    <body.head>
      <hedline>
        <hl1>143 Dead in Colombia Earthquake</hl1>
      </hedline>
      <byline>
        <bytag>By Jared Kotler, Associated Press Writer</bytag>
      </byline>
      <dateline>
        <location>Bogota, Colombia</location>
        <story.date>Monday January 25 1999 7:28 ET</story.date>
      </dateline>
    </body.head>
  </body>
</nitf>
The location subelement contains the place from which the story was filed; the story.date subelement contains the date on which the story was filed.
At this point, all of the infrastructure and frontmatter for the story is in place, and all that remains is to fill in the story body itself. XMLNews-Story allows very rich inline markup if desired, but in the simplest case, you create a body.content element to hold the story text, and put the paragraphs within p elements:
<?xml version="1.0"?>
<nitf>
  <head>
    <title>Colombia Earthquake</title>
  </head>
  <body>
    <body.head>
      <hedline>
        <hl1>143 Dead in Colombia Earthquake</hl1>
      </hedline>
      <byline>
        <bytag>By Jared Kotler, Associated Press Writer</bytag>
      </byline>
      <dateline>
        <location>Bogota, Colombia</location>
        <story.date>Monday January 25 1999 7:28 ET</story.date>
      </dateline>
    </body.head>
    <body.content>
      <p>An earthquake struck western Colombia on Monday, killing at least 143 people and injuring more than 900 as it toppled buildings across the country's coffee-growing heartland, civil defense officials said.</p>
      <p>The early afternoon quake had a preliminary magnitude of 6, according to the U.S. Geological Survey in Golden, Colo. Its epicenter was located in western Valle del Cauca state, 140 miles west of the capital, Bogota.</p>
      <p>The death and damage toll appeared to be highest in Armenia, Pereira and Calarca: three cities near the epicenter.</p>
    </body.content>
  </body>
</nitf>
That's it: at this point, you have a complete and well-formed XMLNews-Story document.
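As a rough illustration of what "well-formed" buys you, the following Python sketch parses a shortened copy of the story with the standard library's xml.etree.ElementTree module and pulls out the frontmatter. The function names summarize and first_text are hypothetical helpers invented for this example; they are not part of XMLNews-Story.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the story built in this section.
STORY = """<?xml version="1.0"?>
<nitf>
  <head>
    <title>Colombia Earthquake</title>
  </head>
  <body>
    <body.head>
      <hedline>
        <hl1>143 Dead in Colombia Earthquake</hl1>
      </hedline>
      <byline>
        <bytag>By Jared Kotler, Associated Press Writer</bytag>
      </byline>
      <dateline>
        <location>Bogota, Colombia</location>
        <story.date>Monday January 25 1999 7:28 ET</story.date>
      </dateline>
    </body.head>
    <body.content>
      <p>An earthquake struck western Colombia on Monday.</p>
    </body.content>
  </body>
</nitf>"""


def first_text(root, tag):
    """Return the text of the first element named `tag`, or None."""
    for element in root.iter(tag):
        return "".join(element.itertext()).strip()
    return None


def summarize(xml_text):
    """Collect the title, frontmatter, and paragraph count of a story."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    return {
        "title": first_text(root, "title"),
        "headline": first_text(root, "hl1"),
        "byline": first_text(root, "bytag"),
        "location": first_text(root, "location"),
        "date": first_text(root, "story.date"),
        "paragraphs": len(list(root.iter("p"))),
    }


summary = summarize(STORY)
print(summary["title"], "/", summary["headline"])
```

Note that parsing only checks well-formedness; checking the document against the XMLNews-Story DTD would need a validating parser, which this sketch does not attempt.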
The story developed in the previous section does not contain any information that could not just as easily have been represented in the old ANPA 1312 wire format: we simply used a different syntax to accomplish the same thing. While switching to a popular standard like XML brings many advantages, including the ability to take advantage of off-the-shelf systems and emerging web-browser support, XMLNews-Story also contains new features that can improve the quality of news handling and publication. The last three sections of this tutorial introduce some of the key new features included in XMLNews-Story, starting with rich inline markup.
Traditionally, news stories have included inline codes specifying how they should appear; XMLNews-Story goes a step further, and uses inline elements to say what things are, as in the following example:
<p>An <event>earthquake</event> struck <location>western <country>Colombia</country></location> on <chron norm="19990125">Monday</chron>, killing at least 143 people and injuring more than 900 as it toppled buildings across the country's coffee-growing heartland, <function>civil defense officials</function> said.</p>
None of this additional markup is required: you are free to leave it out (as in the sample story developed in the last section). When you include it, however, you make the news story much more valuable because news processing systems can extract more information automatically. Here are some examples:
Users might want to search for every story that mentions something that happened in the country Colombia during January 1999.
A news provider might turn the word “Colombia” into a hyperlink that brings up a map and a background page on the country (perhaps with suitable tied advertising).
A writer creating a background piece might want to find every story that mentions actual earthquakes (but not stories that use the word “earthquake” figuratively).
There are many more possibilities for using this rich inline markup, including (but not limited to) intelligent news filters, special typography, and daily news indices (“Countries in the News”, “Events in the News”).
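To make the first of these scenarios concrete, here is a minimal Python sketch of a filter that asks whether a paragraph mentions a given country in a given month. It assumes the paragraph carries country and chron markup with a norm attribute, as in the example above; the helper names tagged_text and mentions are hypothetical.

```python
import xml.etree.ElementTree as ET

PARAGRAPH = ET.fromstring(
    """<p>An <event>earthquake</event> struck <location>western """
    """<country>Colombia</country></location> on """
    """<chron norm="19990125">Monday</chron>, killing at least 143 people.</p>"""
)


def tagged_text(paragraph, tag):
    """Return the text content of every `tag` element in the paragraph."""
    return ["".join(el.itertext()) for el in paragraph.iter(tag)]


def mentions(paragraph, country, year_month):
    """True if the paragraph tags `country` and a date in `year_month` (YYYYMM)."""
    countries = tagged_text(paragraph, "country")
    dates = [el.get("norm", "") for el in paragraph.iter("chron")]
    return country in countries and any(d.startswith(year_month) for d in dates)


print(tagged_text(PARAGRAPH, "event"))           # the tagged events
print(mentions(PARAGRAPH, "Colombia", "199901"))
```

A plain-text search for "Colombia" would also match stories about Colombian coffee prices filed in any month; the markup is what lets the filter be this specific.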
It is important to understand how XMLNews-Story supports a layered approach to inline markup: news providers are not required to use any of these element types. For example, the following paragraph would be perfectly acceptable in a news story:
<p>Turkey's women were promised full equality by the Republic's founder Mustafa Kemal Ataturk more than 70 years ago. But despite the fact that they were given the vote in 1934, women today account for only two percent of the 550 deputies in Turkey's parliament.</p>
A more advanced news system might add inline markup to identify people, places, and organizations:
<p><location>Turkey</location>'s women were promised full equality by the Republic's founder <person>Mustafa Kemal Ataturk</person> more than 70 years ago. But despite the fact that they were given the vote in 1934, women today account for only two percent of the 550 deputies in <org>Turkey's parliament</org>.</p>
Finally, an archivist might add extremely rich markup to allow sophisticated searching, indexing, and analysis in an electronic library:
<p><location><country>Turkey</country></location>'s women were promised full equality by <function>the Republic's founder</function> <person><name.given>Mustafa Kemal</name.given> <name.family>Ataturk</name.family></person> more than 70 years ago. But despite the fact that they were given the vote in <chron norm="1934">1934</chron>, women today account for only two percent of the 550 deputies in <org>Turkey's parliament</org>.</p>
There are two major advantages to this layered approach to inline markup:
news providers can add exactly the right amount of markup for their abilities and their customers' needs; and
redistributors can add value to news stories by introducing additional tagging if there is a demand for it.
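One consequence of the layered approach is that a receiver that does not care about the extra markup can simply ignore it. The Python sketch below, which uses only the first sentence of each of the three example paragraphs, shows that stripping the tags from all three layers yields the same text. The function name plain_text is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

PLAIN = (
    "<p>Turkey's women were promised full equality by the Republic's "
    "founder Mustafa Kemal Ataturk more than 70 years ago.</p>"
)
MEDIUM = (
    "<p><location>Turkey</location>'s women were promised full equality by "
    "the Republic's founder <person>Mustafa Kemal Ataturk</person> more "
    "than 70 years ago.</p>"
)
RICH = (
    "<p><location><country>Turkey</country></location>'s women were "
    "promised full equality by <function>the Republic's founder</function> "
    "<person><name.given>Mustafa Kemal</name.given> "
    "<name.family>Ataturk</name.family></person> more than 70 years ago.</p>"
)


def plain_text(xml_fragment):
    """Drop all inline tags and return only the character data."""
    return "".join(ET.fromstring(xml_fragment).itertext())


print(plain_text(RICH))
```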
The rest of this section introduces the seventeen inline elements allowed (but not required) in the body of a news story.
chron, to tag a date or time;
copyrite, to tag a copyright statement;
event, to tag an event;
function, to tag a person's role or function;
location, to tag a geographical location;
money, to tag a money value of any sort;
num, to tag a numerical expression (including fractions);
object.title, to tag the title of a book, film, painting, etc.;
org, to tag the name of a government, department, company, charity, club, or any other organization;
person, to tag the name of a person (real or imaginary);
virtloc, to tag the name of a virtual location such as a URL or an e-mail address;
a, to tag an HTML link or the target of a link;
br, to force a line break;
em, to tag an emphasized phrase;
lang, to tag a phrase in a different language or dialect;
pronounce, to provide a phonetic pronunciation or guide to pronunciation; and
q, to tag a direct quotation.
The chron element marks a date and time in the text. You can use this simply to tag any text that refers to a specific date or time, or, if you have the information available, you can use the norm attribute to provide a normalized time in a restricted version of ISO 8601 format:
<p><chron norm="19990107">Today</chron>, <person>Bill Clinton</person> spoke to reporters about the situation in Iraq.</p>
The value of the norm attribute contains the normalized date and time, if available. The first eight characters of the attribute's value represent the date in YYYYMMDD format, followed optionally by the letter “T” and the 24-hour time in HHMM[SS] format, followed by “Z” for Universal (Greenwich) Time or a +/- offset for any other time zone.
For example, 9:00 AM on December 25, 1999 in New York City would appear as “19991225T0900-0500” or as “19991225T1400Z”.
A news publisher can use the norm attribute to perform automatic conversions to local time.
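The following Python sketch shows one way a receiver might read a norm value into a datetime object. It assumes the offset is written as a sign followed by HHMM, as in the example above, and it only handles the full eight-digit date form: shorter values (such as a bare year or year and month) raise an error here. The name parse_norm is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# YYYYMMDD, optionally followed by T, HHMM[SS], and Z or a +/-HHMM offset.
NORM_RE = re.compile(
    r"^(\d{4})(\d{2})(\d{2})"
    r"(?:T(\d{2})(\d{2})(\d{2})?(Z|[+-]\d{4})?)?$"
)


def parse_norm(value):
    """Convert a norm attribute value into a datetime (sketch only)."""
    match = NORM_RE.match(value)
    if match is None:
        raise ValueError("unsupported norm value: " + value)
    year, month, day, hour, minute, second, zone = match.groups()
    tz = None
    if zone == "Z":
        tz = timezone.utc
    elif zone:
        sign = 1 if zone[0] == "+" else -1
        offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[3:5]))
        tz = timezone(sign * offset)
    return datetime(
        int(year), int(month), int(day),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )


new_york = parse_norm("19991225T0900-0500")
print(new_york.astimezone(timezone.utc))
```

Because both spellings of the New York example denote the same instant, parse_norm returns values that compare equal for them; converting to a reader's local time is then a single astimezone call.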
The copyrite element marks a copyright statement in the text. The statement consists of plain text, together with the copyright date and copyright holder:
<p><copyrite>Copyright <copyrite.year>1999</copyrite.year> by <copyrite.holder>The Daily News</copyrite.holder>. All rights reserved.</copyrite></p>
This element is particularly useful for generating collective copyright statements for a collection of stories, or for general rights management.
The event element marks the name of an event of any sort, as in the following example:
<p>The tech sector is nervously watching the <event>Microsoft trial</event>.</p>
An event is a specialised type of subject: events drive news, and the ability to distinguish events from the surrounding text allows news handling to be a lot smarter.
The function element marks a job title, activity, or any other role a person fills. The contents may or may not represent a formal title:
<p>Mourners left flowers to pay their respects to <person>Diana</person>, the <function>Princess of Wales</function>.</p>
Some functions, such as “the first man on the moon” or “Prime Minister” are especially useful for searching and filtering stories.
The location element marks a geographical place in the text. The element may contain plain text together with special sublocation, city, state (for a state, province, or other similar administrative district), region, and country subelements for more specific tagging (if desired):
<p>He spoke on the history of the <location><region>Great Lakes basin</region></location> at the <location><sublocation>Royal Ontario Museum</sublocation> in <city>Toronto</city></location>.</p>
Even at the simplest level, the location element helps to distinguish, for example, the Scottish city “Paisley” from the fabric design, or the country “China” from the tableware. At a more advanced level (when the subelements are also included), the location element allows highly-sophisticated news filtering and searching by geographical region, as well as automatically-generated links to maps or background pieces.
A creative news publisher might even automatically select advertising of interest to people reading about a specific location.
The money element marks a monetary item in the text. If desired, you may use the optional unit attribute to specify the currency, as in the following example:
<p>The property changed hands for <money unit="USD">$549,000</money>.</p>
unit: the currency used for the monetary item, in ISO 4217 format (for example, “USD” for American dollars or “EUR” for the Euro).
When the money element includes a unit attribute, the receiver of a story can perform automatic conversions to local currency (from British pounds to U.S. dollars, for example).
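A minimal Python sketch of such a conversion follows. The exchange rates in RATES_TO_USD are invented placeholder values, and the helper name to_usd is hypothetical; the sketch also assumes the element's text contains a single simple number, possibly with a currency symbol and thousands separators.

```python
import re
import xml.etree.ElementTree as ET
from decimal import Decimal

# Placeholder rates for illustration only; a real system would look these up.
RATES_TO_USD = {"USD": Decimal("1"), "GBP": Decimal("1.65")}


def to_usd(money_element):
    """Return the amount in U.S. dollars, or None if it cannot be converted."""
    unit = money_element.get("unit")
    if unit not in RATES_TO_USD:
        return None  # no unit attribute, or a currency we have no rate for
    text = "".join(money_element.itertext())
    digits = re.sub(r"[^0-9.]", "", text)  # drop symbols, commas, and words
    if not digits:
        return None
    return Decimal(digits) * RATES_TO_USD[unit]


price = ET.fromstring('<money unit="USD">$549,000</money>')
print(to_usd(price))
```

Without the unit attribute the function returns None rather than guessing from the "$" sign, which several currencies share.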
The num element marks a numeric expression in the text. This element is more useful for rendering than for searching: in particular, it allows the special subelements sup for representing superscripts, sub for representing subscripts, and frac (with numer and denom) for representing fractions:
<p>The stock opened at <num>51 <frac><numer>15</numer> <denom>16</denom></frac></num>.</p>
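Although the num element is aimed mainly at rendering, the frac markup also makes the value easy to compute. The Python sketch below turns the example above into an exact fraction. It assumes a non-negative whole number followed by at most one frac element; the name num_value is hypothetical.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction


def num_value(num_element):
    """Return the value of a num element as an exact Fraction (sketch)."""
    whole = (num_element.text or "").strip()
    total = Fraction(int(whole)) if whole else Fraction(0)
    frac = num_element.find("frac")
    if frac is not None:
        numerator = int(frac.findtext("numer"))
        denominator = int(frac.findtext("denom"))
        total += Fraction(numerator, denominator)
    return total


opening = ET.fromstring(
    "<num>51 <frac><numer>15</numer> <denom>16</denom></frac></num>"
)
print(float(num_value(opening)))
```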
The object.title element marks a formal title (such as the title of a book, song, or movie) in the text. This element allows only text as its content, so it is not possible to mark up titles within titles:
<p>Some analysts compared the recent events to the film <object.title>Wag the Dog</object.title>.</p>
This element serves two useful purposes. First, different publications generally have different typographical styles for titles: some might use italics, some might use boldface, and others might even use a different colour. With generic markup like this, each publication can use its own style automatically.
Secondly, as with many inline elements, the object.title element allows intelligent news filtering and searching: for example, it is possible to distinguish “Titanic” the ship from “Titanic” the movie. It is also possible to use the element for tied advertising or for automatically-generated hyperlinks.
The org element marks the name of any organization, such as a government, department, ministry, corporation, charity, or club. In addition to plain text, this element may contain a special, empty orgid subelement that uses the idsrc and value attributes to provide a machine-readable identifier for the organization:
<p><org>Nortel Networks <orgid idsrc="http://www.xmlnews.org/ns/orgids/tickers" value="NYSE:NT"/></org> saw its stock fall in the face of the Brazilian devaluation.</p>
If orgid is present, the value of its idsrc attribute should be a fully-qualified URL.
News processing software can use this element for filtering and searching, or for more creative purposes such as generating an automatic link to a publicly-traded company's stock quotes.
The person element marks the name of a human individual (real or imaginary) in the text:
<p><person>Santa Claus</person> is coming to town.</p>
In addition to plain text, person allows the special subelements name.given and name.family, in addition to the general inline element function, for more specific tagging if desired:
<p><person><function>Prime Minister</function> <name.given>Tony</name.given> <name.family>Blair</name.family></person> will meet with the other <org>EU</org> leaders to discuss agricultural policy.</p>
The person element, again, allows for intelligent filtering and searching: you can distinguish “Bush” the shrub from “Bush” the former U.S. president, or “Gates” on doors from rich software developers.
The virtloc element marks a virtual location (such as a domain name, URL, or e-mail address) in the text:
<p>The White House encourages e-mail at <virtloc>president@whitehouse.gov</virtloc>.</p>
The virtloc element allows rendering software to use a special typeface or other visual effects for the text, and processing software to generate automatic hyperlinks.
The a element (taken from HTML) provides a live link or anchor in the text:
<p>The new Java release from <a href="http://www.sun.com/">Sun Microsystems</a> is due <chron norm="199902">next month</chron>.</p>
For printed rendition, the tags may simply be ignored, or the value of the href attribute may be added to the displayed text:
The new Java release from Sun Microsystems (http://www.sun.com/) is due next month.
There are two attributes allowed:
href: a URL target. When this attribute has a value, the element acts as a link.
name: an identifier. When this attribute has a value, the element acts as a target for other links (both href and name may be specified).
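The printed rendition described above, where the link target is appended to the linked text, can be sketched in a few lines of Python. The function name render_for_print is hypothetical, and the sketch simply drops every other tag while keeping its text.

```python
import xml.etree.ElementTree as ET


def render_for_print(element):
    """Flatten an element to text, adding ' (URL)' after each linked phrase."""
    parts = [element.text or ""]
    for child in element:
        parts.append(render_for_print(child))
        if child.tag == "a" and child.get("href"):
            parts.append(" (" + child.get("href") + ")")
        parts.append(child.tail or "")  # text that follows the child's end tag
    return "".join(parts)


paragraph = ET.fromstring(
    '<p>The new Java release from <a href="http://www.sun.com/">Sun '
    'Microsystems</a> is due <chron norm="199902">next month</chron>.</p>'
)
print(render_for_print(paragraph))
```

An a element that has only a name attribute is a link target rather than a link, so the sketch leaves its text unchanged.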
The br element forces a line-break, as in HTML. This element type exists for rendering purposes only: usually, there are better ways to accomplish the same thing (such as the pre and table elements).
There are, however, a few special situations where br can be useful, particularly in addresses:
<delivery.point>Tools Division<br/> ACME Incorporated</delivery.point>
The em element marks an emphasized phrase in the text, as in HTML. The phrase may be rendered in many different ways, including a different colour and a different font size, slant, shape, or weight:
<p><q>The market is <em>not</em> in need of a correction,</q> she said.</p>
It is important not to use em to mark titles; use object.title instead, to allow for better news filtering and searching.
The lang element marks a phrase in a language different from the main body text; the lang attribute allows you to identify the language of the phrase, if desired:
<p><person>Michael Jordan</person>, <function>le basketteur vedette</function> aux six titres de champion <org>NBA</org> avec les <lang lang="en">Chicago Bulls</lang>, a annoncé officiellement <chron norm="19990113">mercredi</chron> sa retraite sportive.</p>
This element is also useful for distinguishing different variants of the same language:
<p>The police found the money in the <lang lang="en-US">trunk</lang>.</p>
The lang attribute contains an ISO 639/RFC 1766 language code with an optional geographic identifier, such as “en” for English, or “de-CH” for Swiss German. It can provide an important cue to processing software (such as search engines or spell-checkers) that the text needs to be treated differently; in some cases, it might even be desirable to provide automatic machine translation.
The pronounce element supplies a pronunciation for a word or phrase. The element itself is empty, but the guide and phonetic attributes provide useful information for a news reader:
<p>The cruise left from Gananoque<pronounce phonetic="gay-na-NAH-kway"/> and proceeded through the <location>Thousand Islands</location>.</p>
The guide attribute contains prose instructions on the pronunciation of a word or phrase, while the phonetic attribute contains a phonetic spelling of a word or phrase.
The q element (from HTML) marks a direct quotation. You should use this element type instead of entering quotation marks in the text:
<p><q>I'm ready to try again,</q> she said.</p>
The q element type may be nested for quotations within quotations:
<p><q>He yelled <q>Put up your hands!</q> and then pointed his gun at me,</q> the victim told reporters.</p>
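Because the q element replaces literal quotation marks, rendering software has to supply them, and nested quotations need alternating marks. The Python sketch below picks the marks by nesting depth from a per-publication list. The names render_quotes, BRITISH, and NORTH_AMERICAN are hypothetical, and the two mark lists are only one plausible house style for each region.

```python
import xml.etree.ElementTree as ET

# (opening mark, closing mark) for each nesting level, outermost first.
BRITISH = [("'", "'"), ('"', '"')]
NORTH_AMERICAN = [('"', '"'), ("'", "'")]


def render_quotes(element, marks, depth=0):
    """Flatten an element to text, wrapping each q in marks for its depth."""
    parts = [element.text or ""]
    for child in element:
        if child.tag == "q":
            open_mark, close_mark = marks[depth % len(marks)]
            inner = render_quotes(child, marks, depth + 1)
            parts.append(open_mark + inner + close_mark)
        else:
            parts.append(render_quotes(child, marks, depth))
        parts.append(child.tail or "")
    return "".join(parts)


paragraph = ET.fromstring(
    "<p><q>He yelled <q>Put up your hands!</q> and then pointed his gun "
    "at me,</q> the victim told reporters.</p>"
)
print(render_quotes(paragraph, BRITISH))
print(render_quotes(paragraph, NORTH_AMERICAN))
```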
This element is useful for filtering and searching: for example, you can search for things that people have actually said about Tibet and separate them from other story information about Tibet.
The element is also useful for rendition, since different countries have different conventions for quotation marks. For example, a British newspaper would render the above example as
'He yelled "Put up your hands!" and then pointed his gun at me,' the victim told reporters.
A North American newspaper, on the other hand, would render the example as
"He yelled 'Put up your hands!' and then pointed his gun at me," the victim told reporters.
Often, news distributors receive tabular information as plain, space-delimited text, and simply publish it literally in a fixed-width font, as in the following example:
Ticker Last Trade       Change                Volume
----------------------------------------------------
YHOO   Feb 12  151      -7 1/2   -4.73%    6,218,000
EBAY   Feb 12  236      -3 1/2   -1.46%    1,211,600
AMZN   Feb 12  104 1/2  -5 3/8   -4.89%    3,882,700
MSFT   Feb 12  157 3/4  -5       -3.07%   15,732,600
AOL    Feb 12  158 1/2  -6 3/16  -3.76%   11,832,900
Often, this type of presentation is the only choice without manually reformatting the information, but it has two obvious disadvantages:
the information is difficult to reformat or typeset attractively without manual reformatting; and
the information is difficult to extract automatically (say, for use in a spreadsheet or database).
To help avoid these problems, XMLNews-Story provides standard markup for tabular information based on the HTML table model. This section introduces the XMLNews-Story table model by demonstrating, step-by-step, how to add the following table to an XMLNews-Story document:
Ticker Last Trade       Change                Volume
YHOO   Feb 12  151      -7 1/2   -4.73%    6,218,000
EBAY   Feb 12  236      -3 1/2   -1.46%    1,211,600
AMZN   Feb 12  104 1/2  -5 3/8   -4.89%    3,882,700
MSFT   Feb 12  157 3/4  -5       -3.07%   15,732,600
AOL    Feb 12  158 1/2  -6 3/16  -3.76%   11,832,900
In XMLNews-Story, tables may not appear directly within body.content; you have to surround them with a block element. A block is a special construction that can contain tables, media objects, or mini-documents (such as a news brief). Here is a minimal XMLNews-Story document with a block containing an empty table element:
<?xml version="1.0"?>
<nitf>
  <head>
    <title>Stock Quotes</title>
  </head>
  <body>
    <body.content>
      <block>
        <table>
        </table>
      </block>
    </body.content>
  </body>
</nitf>
Now that the table element is in place, you can add the two top-level subelements: the table header thead (which is optional), and the table body tbody (which is required).
<table>
  <thead>
  </thead>
  <tbody>
  </tbody>
</table>
Now it's time to create the header row for the table. The tr element represents a single row in the table, while the th element represents a table header cell:
<table>
  <thead>
    <tr>
      <th>Ticker</th>
      <th colspan="2">Last Trade</th>
      <th colspan="2">Change</th>
      <th>Volume</th>
    </tr>
  </thead>
  <tbody>
  </tbody>
</table>
The table has four headings, “Ticker”, “Last Trade”, “Change”, and “Volume”. However, the second and third headings need to span two columns each: under “Last Trade”, the table contains both the date of the last trade and the price of the stock; under “Change”, the table contains the price change both as an absolute dollar amount and as a percentage. The colspan attribute allows any table cell (header or otherwise) to span more than one column of the table.
The table body also uses the tr element to represent rows. Each tr element may contain both th elements (table header cells) and td elements (regular table cells). Since there are five rows in the table (not counting the header row) and six columns, you need to insert five tr elements, each containing six td elements for the individual cells in the row:
<table>
  <thead>
    <tr>
      <th>Ticker</th>
      <th colspan="2">Last Trade</th>
      <th colspan="2">Change</th>
      <th>Volume</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>YHOO</td>
      <td>Feb 12</td>
      <td>151</td>
      <td>-7 1/2</td>
      <td>-4.73%</td>
      <td align="right">6,218,000</td>
    </tr>
    <tr>
      <td>EBAY</td>
      <td>Feb 12</td>
      <td>236</td>
      <td>-3 1/2</td>
      <td>-1.46%</td>
      <td align="right">1,211,600</td>
    </tr>
    <tr>
      <td>AMZN</td>
      <td>Feb 12</td>
      <td>104 1/2</td>
      <td>-5 3/8</td>
      <td>-4.89%</td>
      <td align="right">3,882,700</td>
    </tr>
    <tr>
      <td>MSFT</td>
      <td>Feb 12</td>
      <td>157 3/4</td>
      <td>-5</td>
      <td>-3.07%</td>
      <td align="right">15,732,600</td>
    </tr>
    <tr>
      <td>AOL</td>
      <td>Feb 12</td>
      <td>158 1/2</td>
      <td>-6 3/16</td>
      <td>-3.76%</td>
      <td align="right">11,832,900</td>
    </tr>
  </tbody>
</table>
Note that the last td element in every row has the attribute align set to the value “right”, to specify that the contents of that cell should be right-aligned.
Now that the body is filled in, you have a complete NITF (and HTML) table.
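With the table marked up this way, the second disadvantage of the plain-text version (difficulty of automatic extraction) largely disappears. The Python sketch below reads a two-row excerpt of the table into a list of rows, padding cells that carry a colspan attribute so that every row has the same number of columns. The function name table_rows is hypothetical.

```python
import xml.etree.ElementTree as ET

TABLE = ET.fromstring("""<table>
  <thead>
    <tr>
      <th>Ticker</th>
      <th colspan="2">Last Trade</th>
      <th colspan="2">Change</th>
      <th>Volume</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>YHOO</td> <td>Feb 12</td> <td>151</td>
      <td>-7 1/2</td> <td>-4.73%</td> <td align="right">6,218,000</td>
    </tr>
    <tr>
      <td>EBAY</td> <td>Feb 12</td> <td>236</td>
      <td>-3 1/2</td> <td>-1.46%</td> <td align="right">1,211,600</td>
    </tr>
  </tbody>
</table>""")


def table_rows(table):
    """Return the table as a list of rows, each a list of cell strings."""
    rows = []
    for tr in table.iter("tr"):
        row = []
        for cell in tr:
            if cell.tag not in ("th", "td"):
                continue
            text = "".join(cell.itertext()).strip()
            span = int(cell.get("colspan", "1"))
            # A cell spanning N columns fills one slot and leaves N-1 empty.
            row.extend([text] + [""] * (span - 1))
        rows.append(row)
    return rows


for row in table_rows(TABLE):
    print(row)
```

The resulting rows can be handed directly to a spreadsheet export or a database insert.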
News stories do not consist only of text or tables: traditional news stories often contain photographs or illustrations, and with the growth of the Internet, it is also possible to add sound and video to a story. This section provides an overview of how to add four types of media to NITF news stories:
images;
photos;
audio clips; and
video clips.
External media may be included only inside a block or td (table cell) element, not at the top level of the body.content element:
<body.content>
  <p>[text]</p>
  <block>
    <photo src="http://www.host.net/photo.png">
      <photo.data copyright="Copyright (c) 1998 by host.net"/>
    </photo>
  </block>
  <p>[text]</p>
</body.content>
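A production system would normally enforce this rule by validating against the DTD, but the idea can be sketched in Python without one. The check below assumes that "inside" means the media element is a direct child of block or td, which is an interpretation rather than something stated above; the name misplaced_media is hypothetical.

```python
import xml.etree.ElementTree as ET

MEDIA_TAGS = {"img", "photo", "audio", "video"}
CONTAINER_TAGS = {"block", "td"}


def misplaced_media(root):
    """Return the tags of media elements whose parent is not block or td."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    problems = []
    for element in root.iter():
        if element.tag in MEDIA_TAGS:
            parent = parent_of.get(element)
            if parent is None or parent.tag not in CONTAINER_TAGS:
                problems.append(element.tag)
    return problems


GOOD = ET.fromstring(
    '<body.content><p>[text]</p><block>'
    '<photo src="http://www.host.net/photo.png">'
    '<photo.data copyright="Copyright (c) 1998 by host.net"/></photo>'
    '</block></body.content>'
)
BAD = ET.fromstring(
    '<body.content><photo src="http://www.host.net/photo.png">'
    '<photo.data copyright="Copyright (c) 1998 by host.net"/></photo>'
    '</body.content>'
)
print(misplaced_media(GOOD), misplaced_media(BAD))
```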
The img element includes a computer image. The empty img.data subelement is required, though its copyright attribute is optional:
<img src="welcome.jpg">
  <img.data copyright="Copyright (c) 1999 by Web Designs Unlimited"/>
</img>
The img element may optionally include a caption and information on the producer:
<img src="welcome.jpg">
  <img.caption>
    <caption>Welcome to our web site.</caption>
  </img.caption>
  <img.producer>
    <byline>
      <bytag>Produced by Web Designs Unlimited</bytag>
    </byline>
  </img.producer>
  <img.data copyright="Copyright (c) 1999 by Web Designs Unlimited"/>
</img>
src: this required attribute provides the URL for the computer image.
height: this optional attribute provides the image height in pixels.
width: this optional attribute provides the image width in pixels.
The audio element includes an audio clip:
<audio src="http://www.acme-news.com/clips/flood.ram">
  <audio.data copyright="Copyright (c) 1998 by ACME News"/>
</audio>
The optional audio.caption and audio.producer elements allow you to supply extra information about the clip, and the optional length attribute allows you to specify the playing time of the clip:
<audio src="http://www.acme-news.com/clips/flood.ram" length="00:02:39">
  <audio.caption>
    <caption><person>Ace Freelance</person> reports on the flooding.</caption>
  </audio.caption>
  <audio.producer>
    <byline>
      <bytag>ACME News</bytag>
    </byline>
  </audio.producer>
  <audio.data copyright="Copyright (c) 1998 by ACME News"/>
</audio>
src: this required attribute provides the URL for the file containing the actual audio clip.
length: this optional attribute provides the playing time of the audio clip in hh:mm:ss format.
The video element includes a video clip in the news story:
<video src="http://www.acme-news.com/clips/flood.mpg">
  <video.data copyright="Copyright (c) 1998 by ACME News"/>
</video>
The optional video.caption and video.producer elements allow you to supply extra information about the clip, and the optional length attribute allows you to specify the playing time of the clip:
<video src="http://www.acme-news.com/clips/flood.mpg" length="00:04:49">
  <video.caption>
    <caption><person>Ace Freelance</person> reports on the flooding.</caption>
  </video.caption>
  <video.producer>
    <byline>
      <bytag>ACME News</bytag>
    </byline>
  </video.producer>
  <video.data copyright="Copyright (c) 1998 by ACME News"/>
</video>
src: this required attribute provides the URL for the file containing the video clip.
length: this optional attribute provides the playing time of the clip in hh:mm:ss format.
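A player or listing page will usually want the playing time as a number rather than a string. The short Python sketch below converts the hh:mm:ss value of a length attribute into seconds; the name clip_seconds is hypothetical, and the sketch assumes all three fields are present.

```python
def clip_seconds(length):
    """Convert an hh:mm:ss playing time into a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in length.split(":"))
    return hours * 3600 + minutes * 60 + seconds


print(clip_seconds("00:02:39"))  # the audio clip above
print(clip_seconds("00:04:49"))  # the video clip above
```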
This final section contains a longer sample NITF news story, with different markup densities.
This sample contains only the minimal markup necessary for an NITF news story: it distinguishes the headline, byline, and dateline from the rest of the story, and it uses no inline markup except q (for direct quotations).
<?xml version="1.0"?> <nitf> <head> <title>Snow, Freezing Rain Batter U.S. Northeast</title> </head> <body> <body.head> <hedline> <hl1>Snow, Freezing Rain Batter U.S. Northeast</hl1> </hedline> <byline> <bytag>By Matthew Lewis</bytag> </byline> <dateline> <location>HARTFORD, Conn.</location> <story.date>Friday January 15 12:27 PM ET</story.date> </dateline> </body.head> <body.content> <p>Snow and freezing rain punished the northeastern United States for a second straight day on Friday, causing at least five weather-related deaths, closing airports and spreading misery from Washington, D.C., to Canada.</p> <p>Below a snow line that bisected Maryland, Pennsylvania and New Jersey, a predawn downpour turned road surfaces to ice. The icy buildup also brought down power lines, leaving hundreds of thousands of people without electricity.</p> <p><q>This is one of the most severe storms we've seen in a long time,</q> said a spokeswoman for Baltimore Gas and Electric. <q>We're not making any promises about when all the power will be restored because we're still trying to find all the damage.</q></p> <p>Some 126,000 people were left without power in Pennsylvania, southern New Jersey and northern Maryland, officials said.</p> <p>Farther south, in the Maryland and Virginia suburbs of Washington, D.C., tree limbs in some neighborhoods were weighted down by a half-inch (1.2 cm) of ice and more than 300,000 residents were without power on Friday, prompting the federal government to give workers the day off.</p> <p>Despite the poor weather conditions, President Clinton pushed ahead on Friday morning with his scheduled trip to New York City for a conference on minorities with civil rights activist the Rev. Jesse Jackson. 
Bad weather had forced him to delay his Thursday night departure.</p> <p>Airplane traffic was at a virtual standstill in Boston; Hartford, Connecticut; Manchester, New Hampshire; and Providence, Rhode Island, early on Friday, and flights at all three of New York City's major airports were canceled because of ice.</p> <p>Officials in New York said they expected planes would be able to take off later in the day as temperatures rose.</p> <p><q>What we're looking at is a slow start,</q> said Bill Cahill, spokesman for the New York Metropolitan Transportation Authority.</p> <p>Amtrak also reported delays of up to three hours on trains traveling in and out of New York City on the busy Northeast corridor due to power outages. And local commuter rails were experiencing delays.</p> <p><q>We expect lots of delays today, but the difference between yesterday and today is that today you expect things to improve as the day goes on,</q> Cahill added.</p> <p>Meanwhile, heavy rain near the Atlantic coast spawned local flooding problems for resort communities along the New Jersey and Maryland shorelines.</p> <p>The weather also claimed several lives. Three people were killed when their car swerved on ice and snow and smashed into a tractor-trailer near Clarksburg, West Virginia.</p> <p>In Auburn, Massachusetts, one person was killed during a morning rush hour traffic accident, and another was killed when his dump truck hit an overhead power line on Cape Cod.</p> <p>Snow-clogged Toronto braced for another storm system in what has already been the snowiest month of the century for Canada's largest city.</p> <p>Downtown Toronto, usually snarled with traffic, looked like a snowy ghost town on Friday morning with only a few cars managing the streets and some hardy pedestrians climbing over huge drifts to get to their workplaces. 
More snow was expected on Saturday.</p> <p>Northeast Ohio also saw snow flurries, adding to the 2-8 inches (5-20 cm) of snow dropped on the area overnight, making the morning commute messy, with delays caused by slippery roads and minor accidents.</p> <p>In Detroit, where residents struggled to clear 24 inches (60 cm) of snow, a snow emergency was in effect, many residential streets were unplowed and schools remained closed.</p> <p>Detroit Mayor Dennis Archer and his staff planned to shovel snow from the porches and sidewalks of elderly citizens on Friday.</p> </body.content> </body> </nitf>
This version of the story is a little more complicated than the previous one: it adds inline markup to distinguish locations (location), organisations (org), people (person), people's roles (function), and events (event) from the surrounding text. This is probably the highest level of markup practical for a continuous newsfeed with current technology.
<?xml version="1.0"?> <nitf> <head> <title>Snow, Freezing Rain Batter U.S. Northeast</title> </head> <body> <body.head> <hedline> <hl1>Snow, Freezing Rain Batter <location>U.S. Northeast</location></hl1> </hedline> <byline> <bytag>By Matthew Lewis</bytag> </byline> <dateline> <location><city>HARTFORD</city>, <state>Conn.</state></location> <story.date>Friday January 15 12:27 PM ET</story.date> </dateline> </body.head> <body.content> <p>Snow and freezing rain punished the <location>northeastern United States</location> for a second straight day on Friday, causing at least five weather-related deaths, closing airports and spreading misery from <location>Washington, D.C.</location>, to <location>Canada</location>.</p> <p>Below a snow line that bisected <location>Maryland</location>, <location>Pennsylvania</location> and <location>New Jersey</location>, a predawn downpour turned road surfaces to ice. The icy buildup also brought down power lines, leaving hundreds of thousands of people without electricity.</p> <p><q>This is one of the most severe storms we've seen in a long time,</q> said a <function>spokeswoman</function> for <org>Baltimore Gas and Electric</org>. 
<q>We're not making any promises about when all the power will be restored because we're still trying to find all the damage.</q></p> <p>Some 126,000 people were left without power in <location>Pennsylvania</location>, <location>southern New Jersey</location> and <location>northern Maryland</location>, officials said.</p> <p>Farther south, in the <location>Maryland</location> and <location>Virginia</location> suburbs of <location>Washington, D.C.</location>, tree limbs in some neighborhoods were weighted down by a half-inch (1.2 cm) of ice and more than 300,000 residents were without power on Friday, prompting the <org>federal government</org> to give workers the day off.</p> <p>Despite the poor weather conditions, <person><function>President</function> Clinton</person> pushed ahead on Friday morning with his scheduled trip to <location>New York City</location> for a conference on minorities with <function>civil rights activist</function> the <person>Rev. Jesse Jackson</person>. Bad weather had forced him to delay his Thursday night departure.</p> <p>Airplane traffic was at a virtual standstill in <location>Boston</location>; <location>Hartford, Connecticut</location>; <location>Manchester, New Hampshire</location>; and <location>Providence, Rhode Island</location>, early on Friday, and flights at all three of <location>New York City</location>'s major airports were canceled because of ice.</p> <p>Officials in <location>New York</location> said they expected planes would be able to take off later in the day as temperatures rose.</p> <p><q>What we're looking at is a slow start,</q> said <person>Bill Cahill</person>, spokesman for the <org>New York Metropolitan Transportation Authority</org>.</p> <p><org>Amtrak</org> also reported delays of up to three hours on trains traveling in and out of <location>New York City</location> on the busy Northeast corridor due to power outages. 
And local commuter rails were experiencing delays.</p> <p><q>We expect lots of delays today, but the difference between yesterday and today is that today you expect things to improve as the day goes on,</q> Cahill added.</p> <p>Meanwhile, heavy rain near the <location>Atlantic coast</location> spawned local flooding problems for resort communities along the <location>New Jersey</location> and <location>Maryland</location> shorelines.</p> <p>The weather also claimed several lives. Three people were killed when their car swerved on ice and snow and smashed into a tractor-trailer near <location>Clarksburg, West Virginia</location>.</p> <p>In <location>Auburn, Massachusetts</location>, one person was killed during a morning rush hour traffic accident, and another was killed when his dump truck hit an overhead power line on <location>Cape Cod</location>.</p> <p>Snow-clogged <location>Toronto</location> braced for another storm system in what has already been the <event>snowiest month of the century</event> for <location>Canada</location>'s largest city.</p> <p><location>Downtown Toronto</location>, usually snarled with traffic, looked like a snowy ghost town on Friday morning with only a few cars managing the streets and some hardy pedestrians climbing over huge drifts to get to their workplaces. 
More snow was expected on Saturday.</p> <p><location>Northeast Ohio</location> also saw snow flurries, adding to the 2-8 inches (5-20 cm) of snow dropped on the area overnight, making the morning commute messy, with delays caused by slippery roads and minor accidents.</p> <p>In <location>Detroit</location>, where residents struggled to clear 24 inches (60 cm) of snow, a <event>snow emergency</event> was in effect, many residential streets were unplowed and schools remained closed.</p> <p><function>Detroit Mayor</function> <person>Dennis Archer</person> and his staff planned to shovel snow from the porches and sidewalks of elderly citizens on Friday.</p> </body.content> </body> </nitf>
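One benefit of this inline markup is that software can pull the marked-up names out of a story without any language analysis. The following is a minimal Python sketch of that idea, not part of the XMLNews-Story specification: the function name collect_entities, the INLINE_TAGS tuple, and the two-paragraph SAMPLE excerpt are assumptions made for illustration, and only the standard library's xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Two paragraphs in the style of the story above (second one abridged).
SAMPLE = """<nitf><body><body.content>
<p><q>What we're looking at is a slow start,</q> said <person>Bill Cahill</person>, spokesman for the <org>New York Metropolitan Transportation Authority</org>.</p>
<p><org>Amtrak</org> also reported delays of up to three hours on trains traveling in and out of <location>New York City</location>.</p>
</body.content></body></nitf>"""

# Inline elements used in this version of the story (illustrative list).
INLINE_TAGS = ("location", "org", "person", "function", "event")


def collect_entities(xml_text):
    """Group the text of each inline element by its element name."""
    root = ET.fromstring(xml_text)
    found = defaultdict(list)
    for tag in INLINE_TAGS:
        for elem in root.iter(tag):
            # itertext() also picks up text inside nested elements,
            # e.g. a function nested inside a person.
            text = " ".join("".join(elem.itertext()).split())
            found[tag].append(text)
    return dict(found)


entities = collect_entities(SAMPLE)
for tag, values in entities.items():
    print(tag, values)
```

Element names that do not occur in the input are simply absent from the result, so the same function can be run over a story with lighter markup.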
The last version of the story contains the densest markup possible: it distinguishes individual components within locations (country, region, state, city, and sublocation) and within people's names (name.given and name.family), and it wraps each date or time reference in a chron element whose norm attribute provides a normalised version.
<?xml version="1.0"?> <nitf> <head> <title>Snow, Freezing Rain Batter U.S. Northeast</title> </head> <body> <body.head> <hedline> <hl1>Snow, Freezing Rain Batter <location><country>U.S.</country> <region>Northeast</region></location></hl1> </hedline> <byline> <bytag>By Matthew Lewis</bytag> </byline> <dateline> <location><city>HARTFORD</city>, <state>Conn.</state></location> <story.date>Friday January 15 12:27 PM ET</story.date> </dateline> </body.head> <body.content> <p>Snow and freezing rain punished the <location>northeastern <country>United States</country></location> for a second straight day on <chron norm="19990115">Friday</chron>, causing at least five weather-related deaths, closing airports and spreading misery from <location><city>Washington</city>, <state>D.C.</state></location>, to <location><country>Canada</country></location>.</p> <p>Below a snow line that bisected <location><state>Maryland</state></location>, <location><state>Pennsylvania</state></location> and <location><state>New Jersey</state></location>, a <chron norm="19990115">predawn</chron> downpour turned road surfaces to ice. The icy buildup also brought down power lines, leaving hundreds of thousands of people without electricity.</p> <p><q>This is one of the most severe storms we've seen in a long time,</q> said a <function>spokeswoman</function> for <org>Baltimore Gas and Electric</org>. 
<q>We're not making any promises about when all the power will be restored because we're still trying to find all the damage.</q></p> <p>Some 126,000 people were left without power in <location><state>Pennsylvania</state></location>, <location>southern <state>New Jersey</state></location> and <location>northern <state>Maryland</state></location>, officials said.</p> <p>Farther south, in the <location><state>Maryland</state></location> and <location><state>Virginia</state></location> suburbs of <location><city>Washington</city>, <state>D.C.</state></location>, tree limbs in some neighborhoods were weighted down by a half-inch (1.2 cm) of ice and more than 300,000 residents were without power on <chron norm="19990115">Friday</chron>, prompting the <org>federal government</org> to give workers the day off.</p> <p>Despite the poor weather conditions, <person><function>President</function> <name.family>Clinton</name.family></person> pushed ahead on <chron norm="19990115">Friday morning</chron> with his scheduled trip to <location><city>New York City</city></location> for a conference on minorities with <function>civil rights activist</function> the <person><function>Rev.</function> <name.given>Jesse</name.given> <name.family>Jackson</name.family></person>. 
Bad weather had forced him to delay his <chron norm="19990114">Thursday night</chron> departure.</p> <p>Airplane traffic was at a virtual standstill in <location><city>Boston</city></location>; <location><city>Hartford</city>, <state>Connecticut</state></location>; <location><city>Manchester</city>, <state>New Hampshire</state></location>; and <location><city>Providence</city>, <state>Rhode Island</state></location>, early on <chron norm="19990115">Friday</chron>, and flights at all three of <location><city>New York City</city>'s <sublocation>major airports</sublocation></location> were canceled because of ice.</p> <p>Officials in <location><city>New York</city></location> said they expected planes would be able to take off <chron norm="19990115">later in the day</chron> as temperatures rose.</p> <p><q>What we're looking at is a slow start,</q> said <person><name.given>Bill</name.given> <name.family>Cahill</name.family></person>, spokesman for the <org>New York Metropolitan Transportation Authority</org>.</p> <p><org>Amtrak</org> also reported delays of up to three hours on trains traveling in and out of <location><city>New York City</city></location> on the busy <location>Northeast corridor</location> due to power outages. And local commuter rails were experiencing delays.</p> <p><q>We expect lots of delays <chron norm="19990115">today</chron>, but the difference between <chron norm="19990114">yesterday</chron> and today is that today you expect things to improve as the day goes on,</q> Cahill added.</p> <p>Meanwhile, heavy rain near the <location><region>Atlantic coast</region></location> spawned local flooding problems for <location><sublocation>resort communities</sublocation> along the <state>New Jersey</state> and <state>Maryland</state> <region>shorelines</region></location>.</p> <p>The weather also claimed several lives. 
Three people were killed when their car swerved on ice and snow and smashed into a tractor-trailer near <location><city>Clarksburg</city>, <state>West Virginia</state></location>.</p> <p>In <location><city>Auburn</city>, <state>Massachusetts</state></location>, one person was killed during a morning rush hour traffic accident, and another was killed when his dump truck hit an overhead power line on <location><region>Cape Cod</region></location>.</p> <p>Snow-clogged <location><city>Toronto</city></location> braced for another storm system in what has already been the <event>snowiest month of the century</event> for <location><country>Canada</country></location>'s largest city.</p> <p><location><sublocation>Downtown</sublocation> <city>Toronto</city></location>, usually snarled with traffic, looked like a snowy ghost town on <chron norm="19990115">Friday morning</chron> with only a few cars managing the streets and some hardy pedestrians climbing over huge drifts to get to their workplaces. More snow was expected on <chron norm="19990116">Saturday</chron>.</p> <p><location>Northeast <state>Ohio</state></location> also saw snow flurries, adding to the 2-8 inches (5-20 cm) of snow dropped on the area overnight, making the morning commute messy, with delays caused by slippery roads and minor accidents.</p> <p>In <location><city>Detroit</city></location>, where residents struggled to clear 24 inches (60 cm) of snow, a <event>snow emergency</event> was in effect, many residential streets were unplowed and schools remained closed.</p> <p><function>Detroit Mayor</function> <person><name.given>Dennis</name.given> <name.family>Archer</name.family></person> and his staff planned to shovel snow from the porches and sidewalks of elderly citizens on <chron norm="19990115">Friday</chron>.</p> </body.content> </body> </nitf>
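The normalised dates and name components in this version are what make the story easy to process by machine. The Python sketch below shows one way a program might read them; it is an illustration only. The helper names chron_dates and family_names are hypothetical, the SAMPLE excerpt is cut down from the story above, and the code assumes that every norm value is an eight-digit YYYYMMDD date, as in the examples here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Three paragraphs excerpted from the densely marked-up story above.
SAMPLE = """<nitf><body><body.content>
<p>Bad weather had forced him to delay his <chron norm="19990114">Thursday night</chron> departure.</p>
<p><q>What we're looking at is a slow start,</q> said <person><name.given>Bill</name.given> <name.family>Cahill</name.family></person>, spokesman for the <org>New York Metropolitan Transportation Authority</org>.</p>
<p>More snow was expected on <chron norm="19990116">Saturday</chron>.</p>
</body.content></body></nitf>"""


def chron_dates(root):
    """Return (phrase, date) pairs for every chron element with a norm value.

    Assumes norm holds a YYYYMMDD date, as in the tutorial's examples.
    """
    pairs = []
    for chron in root.iter("chron"):
        norm = chron.get("norm")
        if norm is None:
            continue  # skip chron elements without a normalised value
        phrase = "".join(chron.itertext())
        pairs.append((phrase, datetime.strptime(norm, "%Y%m%d").date()))
    return pairs


def family_names(root):
    """Return the family names of all people marked up in the story."""
    return [elem.text for elem in root.iter("name.family")]


root = ET.fromstring(SAMPLE)
pairs = chron_dates(root)
names = family_names(root)

for phrase, day in pairs:
    print(phrase, "=", day.isoformat())
print("family names:", names)
```

Because the relative phrases ("Thursday night", "Saturday") carry an explicit date, a program can sort or filter stories by the events they describe without having to know when the story was written.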