The IPTC NewsCodes Working Group is pleased to announce the latest release of the IPTC NewsCodes, our set of controlled vocabularies for the news industry.
Updates this time span many vocabularies, with the biggest updates to Media Topic and Digital Source Type.
Media Topic updates
Most of the recent work has been in the politics branch.
3 new concepts: by-election, recall election, coalition building
2 retired concepts: political campaigns, church elections
4 modified concept names (in English): voting system, referendum, fundamental rights, football (yes we finally refer to the sport as “football” in en-GB and “soccer” in en-US!)
Modified concept definitions: 22 civil rights, election, voting system, intergovernmental elections, local elections, primary elections, referendum, regional elections, voting, fundamental rights, censorship and freedom of speech, freedom of religion, freedom of the press, human rights, football, political debates, privacy, women’s rights, breaking (breakdance)
1 hierarchy move: fundamental rights has been moved from politics to society.
Also, the Wikidata mapping URIs have all been changed to point to the http:// version of the URI instead of the https:// version. This follows the official Wikidata guidance.
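For consumers that still hold the older https:// mappings, the change can be mirrored locally. The following is a minimal sketch, assuming the mappings use Wikidata's usual entity URI form (http://www.wikidata.org/entity/Q...); the function name is hypothetical and not part of any IPTC tool.

```python
HTTPS_PREFIX = "https://www.wikidata.org/entity/"
HTTP_PREFIX = "http://www.wikidata.org/entity/"


def to_http_wikidata_uri(uri: str) -> str:
    """Rewrite an https:// Wikidata entity URI to its http:// form.

    URIs that do not start with the https:// entity prefix are returned unchanged.
    """
    if uri.startswith(HTTPS_PREFIX):
        return HTTP_PREFIX + uri[len(HTTPS_PREFIX):]
    return uri


# Example: normalise a stored mapping before comparing it with the vocabulary.
normalised = to_http_wikidata_uri("https://www.wikidata.org/entity/Q42")
```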
See the official Media Topic vocabulary on the IPTC Controlled Vocabulary server, and an easier-to-navigate tree view. An Excel version of IPTC Media Topics is also available.
Digital Source Type updates
5 new concepts have been added:
- Multi-frame computational capture sampled from real life, intended to cover media recorded by modern cameras and smartphones that may process several captured images together to create the saved media file, without any interaction with the photographer.
- Human-edited media, intended to replace the retired Original media with minor human edits, given that it is subjective to decide what is a “minor” edit.
- Digital creation, intended to replace the retired Digital art so that we can avoid the existential question of “what is art?”
- Screen capture, covering screenshots and screen recordings made on a device
- Composite of elements, as a generic form of the more specific “composite” terms.
2 concepts have been retired: Original media with minor human edits, and Digital art, as explained above.
8 concepts have had their names and definitions modified, while retaining the same machine-readable ID for backwards-compatibility purposes:
- Digital capture sampled from real life (ID: digitalCapture), replacing the previous name “Original digital capture sampled from real life”
- Digitised from a transparent negative (ID: negativeFilm), replacing the previous name “Digitised from a negative on film”
- Digitised from a transparent positive (ID: positiveFilm), replacing the previous name “Digitised from a positive on film”
- Digitised from a non-transparent medium (ID: print), replacing the previous name “Digitised from a print on non-transparent medium”
- Edited using Generative AI (ID: compositeWithTrainedAlgorithmicMedia), replacing the previous name “Composite with Trained algorithmic media”
- Algorithmically-altered media (ID: algorithmicallyEnhanced), replacing the previous name “Algorithmically Enhanced”
- Created using Generative AI (ID: trainedAlgorithmicMedia), replacing the previous name “Trained Algorithmic Media”
- Virtual event recording (ID: virtualRecording), replacing the previous name “Virtual recording”
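Because the machine-readable IDs are unchanged, systems that store the ID rather than the name only need to refresh their display labels. A minimal sketch of such a lookup, using the IDs and new names listed above (the function name and the fallback behaviour are assumptions for illustration):

```python
# New English names keyed by the unchanged machine-readable IDs.
DIGITAL_SOURCE_TYPE_NAMES = {
    "digitalCapture": "Digital capture sampled from real life",
    "negativeFilm": "Digitised from a transparent negative",
    "positiveFilm": "Digitised from a transparent positive",
    "print": "Digitised from a non-transparent medium",
    "compositeWithTrainedAlgorithmicMedia": "Edited using Generative AI",
    "algorithmicallyEnhanced": "Algorithmically-altered media",
    "trainedAlgorithmicMedia": "Created using Generative AI",
    "virtualRecording": "Virtual event recording",
}


def display_name(code: str) -> str:
    """Return the current name for an ID such as 'digsrctype:print' or 'print'.

    Unknown IDs fall back to the bare ID (an illustrative choice).
    """
    concept_id = code.split(":", 1)[-1]
    return DIGITAL_SOURCE_TYPE_NAMES.get(concept_id, concept_id)
```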
Our thanks go to IPTC representatives and experts from Partnership on AI, Google, Adobe, C2PA, CIPA and many others for their work on these updates to our vocabulary, which is now widely used to identify Generative AI content.
Updates to other NewsCodes vocabularies
Alternative Identifier Role (altidrole)
- Vocabulary’s name changed to fix a spelling mistake.
- New concept: IPTC Video Metadata Hub ID (altidrole:vmhVideoId)
Event Occur Status (eocstat)
- Fix spelling mistake “occurence” -> “occurrence” throughout.
Golf Shot (spgolshot)
- New concept: Chip (spgolshot:chip)
Rights Property (rightsprop)
- New concept: Copyright Year (rightsprop:copyrightyear)
- 4 modified definitions: Minor Model Age Disclosure, Model Release Id, Model Release Status, Property Release Status.
Sports Concept (spct)
- New concept: Recurring Competition (spct:recurring-competition)
- New concept: Governing Body (spct:governing-body)
The IPTC NewsCodes Working Group has released the latest update to IPTC NewsCodes vocabularies.
The changes are quite minor this time, but we still recommend that users stay up to date with the latest version.
Changes to Media Topics vocabulary
Our main subject classification taxonomy, IPTC Media Topics, has seen the following updates:
1 new concept
- breaking (breakdance) (added earlier this year in time for the Paris 2024 Olympics)
1 retired concept
- missing in action (duplicate term added in error in the 2024 Q1 update. The existing term missing in action medtop:20000061 was moved to replace the newer term)
32 modified definitions
These changes mostly correct spelling errors in en-GB where US spellings had slipped in, such as changing “behavior” to “behaviour” for en-GB:
wireless technology, tobacco and nicotine, economic trends and indicators, international economic institution, stocks and securities, adult and continuing education, upper secondary education, social learning, medical condition, Confucianism, relations between religion and government, road cycling, competitive dancing, sexual misconduct, developmental disorder, fraternal and community group, cyber warfare, public transport, taxi and ride-hailing, shared transport, business reporting and performance, business restructuring, commercial real estate, residential real estate, podcast, financial service, business service, news industry, diversity, equity and inclusion, sustainability, profit sharing, breaking (breakdance).
As usual, the Media Topics vocabularies can be viewed in the following ways:
- In a collapsible tree view
- As a downloadable Excel spreadsheet
- On one page on the cv.iptc.org server
- In machine readable formats such as RDF/XML and Turtle using the SKOS vocabulary format: see the cv.iptc.org guidelines document for more detail.
Updates to other vocabularies
Horse Position (sphorposition)
New term “trainer” added to https://cv.iptc.org/newscodes/sphorposition. This term is needed by IPTC Sport Schema.
For more information on IPTC NewsCodes in general, please see the IPTC NewsCodes Guidelines.
The IPTC News Architecture Working Group is happy to announce the release of NewsML-G2 version 2.34.
This version, approved at the IPTC Standards Committee Meeting at the New York Times offices on Wednesday 17th April 2024, contains one small change and one additional feature:
Change Request 218, increase nesting of <related> tags: this allows for <related> items to contain child <related> items, up to three levels of nesting. This can be applied to many NewsML-G2 elements:
- pubHistory/published
- QualRelPropType (used in itemClass, action)
- schemeMeta
- ConceptRelationshipsGroup (used in concept, event, Flex1PropType, Flex1RolePropType, FlexPersonPropType, FlexOrganisationPropType, FlexGeoAreaPropType, FlexPOIPropType, FlexPartyPropType, FlexLocationPropType)
Note that we chose not to allow for recursive nesting because this caused problems with some XML code generators and XML editors.
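Producers may want to check the three-level limit before sending a document. The sketch below counts how deeply related elements are nested inside one another. It is an illustration only: it works on un-namespaced fragments, whereas real NewsML-G2 documents are namespaced, and the qcode values are hypothetical.

```python
import xml.etree.ElementTree as ET

MAX_RELATED_DEPTH = 3  # "up to three levels of nesting"


def related_depth(element: ET.Element, depth: int = 0) -> int:
    """Return the longest chain of <related> elements nested directly in one another."""
    deepest = depth
    for child in element:
        if child.tag == "related":
            deepest = max(deepest, related_depth(child, depth + 1))
        else:
            # A different element interrupts the chain, so counting restarts.
            deepest = max(deepest, related_depth(child, 0))
    return deepest


def nesting_ok(xml_text: str) -> bool:
    """True if no chain of nested <related> elements exceeds the limit."""
    return related_depth(ET.fromstring(xml_text)) <= MAX_RELATED_DEPTH


THREE_LEVELS = (
    '<concept><related qcode="ex:a"><related qcode="ex:b">'
    '<related qcode="ex:c"/></related></related></concept>'
)
FOUR_LEVELS = (
    '<concept><related qcode="ex:a"><related qcode="ex:b"><related qcode="ex:c">'
    '<related qcode="ex:d"/></related></related></related></concept>'
)
```

Schema validation remains the authoritative check; a helper like this is only useful as an early warning in a production pipeline.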
Change Request 219, add dataMining element to rightsinfo: In accordance with other IPTC standards such as the IPTC Photo Metadata Standard and Video Metadata Hub, we have now added a new element to the <rightsInfo> block to convey a content owner’s wishes in terms of data mining of the content. We recommend the use of the PLUS Vocabulary that is also recommended for the other IPTC standards: https://ns.useplus.org/LDF/ldf-XMPSpecification#DataMining
Here are some examples of its use:
Denying all Generative AI / Machine Learning training using this content:
<rightsInfo>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING"/>
</rightsInfo>
A simple text-based constraint:
<rightsInfo>
  <usageTerms>Data mining allowed for academic and research purposes only.</usageTerms>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT"/>
</rightsInfo>
A simple text-based constraint, expressed using a QCode instead of a URI:
<rightsInfo>
  <usageTerms>Reprint rights excluded.</usageTerms>
  <dataMining qcode="plusvocab:DMI-PROHIBITED-SEECONSTRAINT"/>
</rightsInfo>
A text-based constraint expressed in both English and French:
<rightsInfo>
  <usageTerms xml:lang="en">Reprint rights excluded.</usageTerms>
  <usageTerms xml:lang="fr">droits de réimpression exclus</usageTerms>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT"/>
</rightsInfo>
Using the “see embedded rights expression” constraint to express a complex machine-readable rights expression in RightsML:
<rightsInfo>
  <rightsExpressionXML langid="http://www.w3.org/ns/odrl/2/">
    <!-- RightsML goes here... -->
  </rightsExpressionXML>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR"/>
</rightsInfo>
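On the receiving side, a consumer might read the dataMining value roughly as follows. This is a sketch under stated assumptions: it parses un-namespaced fragments like the examples above (real NewsML-G2 documents are namespaced), and it treats the part of a QCode after the colon as the PLUS term instead of resolving the scheme alias through a catalog, which a full implementation would do.

```python
import xml.etree.ElementTree as ET
from typing import Optional

PLUS_VOCAB = "http://ns.useplus.org/ldf/vocab/"


def data_mining_term(rights_info_xml: str) -> Optional[str]:
    """Return the PLUS term (e.g. 'DMI-PROHIBITED-AIMLTRAINING'), or None if absent."""
    root = ET.fromstring(rights_info_xml)
    node = root.find("dataMining")
    if node is None:
        return None
    uri = node.get("uri")
    if uri and uri.startswith(PLUS_VOCAB):
        return uri[len(PLUS_VOCAB):]
    qcode = node.get("qcode")
    if qcode and ":" in qcode:
        # Simplification: assumes the QCode alias points at the PLUS vocabulary.
        return qcode.split(":", 1)[1]
    return None


URI_EXAMPLE = (
    '<rightsInfo><dataMining '
    'uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING"/></rightsInfo>'
)
QCODE_EXAMPLE = (
    '<rightsInfo><usageTerms>Reprint rights excluded.</usageTerms>'
    '<dataMining qcode="plusvocab:DMI-PROHIBITED-SEECONSTRAINT"/></rightsInfo>'
)
```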
For more information, contact the IPTC News Architecture Working Group via the public NewsML-G2 mailing list.
The IPTC Sports Content Working Group is happy to announce the release of IPTC Sport Schema version 1.0.
The first new IPTC standard to be released in more than 10 years, IPTC Sport Schema is a comprehensive model for the storage, transmission and querying of sports data. It has been tested on real-world use cases that are common in any newsroom or sports organisation.
IPTC Sport Schema has evolved from its predecessor SportsML. In contrast to the document-oriented nature of SportsML, IPTC Sport Schema takes a data-centric approach which is better suited to systems dealing with large volumes of data and also helps with integration across data sets.
“We reached out to many companies dealing with sports content and built up a clear picture of their needs,” says IPTC Sports Content Working Group lead Paul Kelly. “They wanted up-to-date formats, easy querying, the ability to handle e-sports and the ability to cross-reference between different media and data silos. IPTC Sport Schema addresses those requirements with a new basic model at the abstract end, and adhering to common use cases to keep things grounded.”
Content in Sport Schema is represented in the W3C’s universal Resource Description Framework (RDF), which renders any kind of data as triples in the form subject->predicate->object. Each component of a Sport Schema triple has a reference to an ontology, which defines the model at the heart of the standard. Querying is done using the W3C’s SPARQL standard, a kind of SQL for RDF.
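To make the triple idea concrete, here is a toy illustration in plain Python. It is not SPARQL and does not use the Sport Schema ontology: the "ex:" identifiers are invented, and the pattern matcher only imitates what a single SPARQL triple pattern does (fixed terms must match, unspecified terms act as variables).

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical data: none of these identifiers come from the real ontology.
TRIPLES = [
    ("ex:team-1", "ex:name", "Example United"),
    ("ex:player-7", "ex:memberOf", "ex:team-1"),
    ("ex:player-9", "ex:memberOf", "ex:team-1"),
    ("ex:player-3", "ex:memberOf", "ex:team-2"),
]


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the pattern; None behaves like a query variable."""
    for s, p, o in triples:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


# "Who is a member of team 1?" expressed as one triple pattern.
roster = [s for s, _, _ in match(TRIPLES, predicate="ex:memberOf", obj="ex:team-1")]
```

A real deployment would load the data into an RDF store and run SPARQL against it; the point here is only the shape of the data and of a query.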
“The IPTC has been working on RDF and semantic web standards for more than 10 years, going back to rNews and RightsML,” said IPTC Managing Director Brendan Quinn. “So we are very happy to release another semantic standard that can help organisations to publish and share sports data in a vendor-neutral, interoperable way.”
Being RDF-based, IPTC Sport Schema can be rendered in XML, JSON and the simple Turtle format, and can be converted easily between all three formats using free tools such as Apache Jena.
“Those familiar with SportsML or SportsJS should recognise the basic components of Sport Schema,” says Kelly, “both in the ontology and in the sports vocabularies introduced with SportsML 3.0, which were designed specifically with semantic technologies in mind.”
To support take-up and share information about the new standard, the IPTC has created a dedicated website, sportschema.org. The site contains:
- a list of use cases which were used to help design the schema and data structures
- example instance diagrams for various sports to help understand how the model can be applied to team, individual and other types of sports
- a data dictionary comparing IPTC Sport Schema to other prominent sport schemas (SportsML, ODF, BBC Ontology, etc.)
- A detailed and comprehensive IPTC Sport Schema ontology reference showing all classes, relationships and properties.
- A tool to validate Sport Schema data using the SHACL format to ensure RDF triples adhere to the specification (equivalent to XML Schema or JSON Schema)
- A tool to convert SportsML documents to IPTC Sport Schema data
- A set of unit tests and sample data files that were used to develop and maintain Sport Schema, including a bespoke unit test framework that ensures our example SPARQL queries continue to satisfy our use cases as the model evolves.
Those wishing to try out some SPARQL queries against some sports data should visit Sport Schema’s query endpoint. It includes example queries showing how to build a team roster, league standings and more from our sample data sets.
For more information on IPTC Sport Schema, see the IPTC’s landing pages on the IPTC Sport Schema standard, the standalone site sportschema.org, or the project’s GitHub repository.
If you are interested in joining those who are working on implementing IPTC Sport Schema in your project or your organisation, we would love to hear from you. Please contact us via IPTC’s contact form.
Following the IPTC’s recent announcement that Rights holders can exclude images from generative AI with IPTC Photo Metadata Standard 2023.1, the IPTC Video Metadata Working Group is very happy to announce that the same capability now exists for video, through IPTC Video Metadata Hub version 1.5.
The “Data Mining” property has been added to this new version of IPTC Video Metadata Hub, which was approved by the IPTC Standards Committee on October 4th, 2023. Because it uses the same XMP identifier as the Photo Metadata Standard property, the existing support in the latest versions of ExifTool will also work for video files.
Therefore, adding metadata to a video file that says it should be excluded from Generative AI indexing is as simple as running this command in a terminal window:
exiftool -XMP-plus:DataMining="Prohibited for Generative AI/ML training" example-video.mp4
(Please note that this will only work in ExifTool version 12.67 and above, i.e. any version of ExifTool released after September 19, 2023)
The possible values of the Data Mining property are listed below:
PLUS URI | Description (use exactly this text with ExifTool) |
http://ns.useplus.org/ldf/vocab/DMI-UNSPECIFIED | Unspecified – no prohibition defined |
http://ns.useplus.org/ldf/vocab/DMI-ALLOWED | Allowed |
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING | Prohibited for AI/ML training |
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-GENAIMLTRAINING | Prohibited for Generative AI/ML training |
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-EXCEPTSEARCHENGINEINDEXING | Prohibited except for search engine indexing |
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED | Prohibited |
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT | Prohibited, see plus:OtherConstraints |
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR | Prohibited, see iptcExt:EmbdEncRightsExpr |
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEELINKEDRIGHTSEXPR | Prohibited, see iptcExt:LinkedEncRightsExpr |
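For batch workflows, the ExifTool call can be wrapped in a small script. The sketch below maps each PLUS URI to the description text from the table and builds the same command line shown earlier. The function names are hypothetical, the example assumes exiftool (12.67 or later) is on the PATH, and the dash in the "Unspecified" text is copied from the table, so it should be checked against the values ExifTool itself accepts.

```python
import subprocess

VOCAB = "http://ns.useplus.org/ldf/vocab/"

# PLUS URI -> description text to pass to ExifTool (copied from the table above).
DATA_MINING_VALUES = {
    VOCAB + "DMI-UNSPECIFIED": "Unspecified – no prohibition defined",
    VOCAB + "DMI-ALLOWED": "Allowed",
    VOCAB + "DMI-PROHIBITED-AIMLTRAINING": "Prohibited for AI/ML training",
    VOCAB + "DMI-PROHIBITED-GENAIMLTRAINING": "Prohibited for Generative AI/ML training",
    VOCAB + "DMI-PROHIBITED-EXCEPTSEARCHENGINEINDEXING": "Prohibited except for search engine indexing",
    VOCAB + "DMI-PROHIBITED": "Prohibited",
    VOCAB + "DMI-PROHIBITED-SEECONSTRAINT": "Prohibited, see plus:OtherConstraints",
    VOCAB + "DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR": "Prohibited, see iptcExt:EmbdEncRightsExpr",
    VOCAB + "DMI-PROHIBITED-SEELINKEDRIGHTSEXPR": "Prohibited, see iptcExt:LinkedEncRightsExpr",
}


def exiftool_args(plus_uri: str, path: str) -> list:
    """Build the argument list for setting the Data Mining property on one file."""
    description = DATA_MINING_VALUES[plus_uri]  # KeyError for unknown URIs
    return ["exiftool", f"-XMP-plus:DataMining={description}", path]


def set_data_mining(plus_uri: str, path: str) -> None:
    """Run ExifTool on the file; raises CalledProcessError if ExifTool fails."""
    subprocess.run(exiftool_args(plus_uri, path), check=True)
```

Passing the arguments as a list avoids shell quoting issues with the spaces and slashes in the description text.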
A corresponding new property “Other Constraints” has also been added to Video Metadata Hub v1.5. This property allows plain-text human-readable constraints to be placed on the video when using the “Prohibited, see plus:OtherConstraints” value of the Data Mining property.
The Video Metadata Hub User Guide and Video Metadata Hub Generator have also been updated to include the new Data Mining property added in version 1.5.
We look forward to seeing video tools (and particularly crawling engines for generative AI training systems) implement the new properties.
Please feel free to discuss the new version of Video Metadata Hub on the public iptc-videometadata discussion group, or contact IPTC via the Contact us form.
The IPTC NewsML-G2 Working Group and the News Architecture Working Group are happy to announce the release of the latest version of our flagship XML-based news syndication standard: NewsML-G2 v2.33.
Changes in the latest version are small but significant. We have added support for the Digital Source Type property, which is already used in IPTC’s sister standards IPTC Photo Metadata Standard, IPTC Video Metadata Hub and ninjs. This property can be used to declare when content has been created or modified by software, including by Generative AI engines.
Examples of possible values for the digital source type property using the recommended IPTC Digital Source Type NewsCodes vocabulary are:
ID (in QCode format) | Name | Example |
digsrctype:digitalCapture | Original digital capture sampled from real life: The digital media is captured from a real-life source using a digital camera or digital recording device | Digital video taken using a digital film, video or smartphone camera |
digsrctype:negativeFilm | Digitised from a negative on film: The digital image was digitised from a negative on film or any other transparent medium | Digital photo scanned from a photographic negative |
digsrctype:minorHumanEdits | Original media with minor human edits: Minor augmentation or correction by a human, such as a digitally-retouched photo used in a magazine | Original audio with minor edits (e.g. to eliminate breaks) |
digsrctype:algorithmicallyEnhanced | Algorithmic enhancement: Minor augmentation or correction by algorithm | A photo that has been digitally enhanced using a mechanism such as Google Photos’ “denoise” feature |
digsrctype:dataDrivenMedia | Data-driven media: Digital media representation of data via human programming or creativity | Textual weather report generated by code using readings from weather detection instruments |
digsrctype:trainedAlgorithmicMedia | Trained algorithmic media: Digital media created algorithmically using a model derived from sampled content | A “deepfake” video using a combination of a real actor and a trained model |
The above list is a subset of the full list of recommended values. See the full IPTC Digital Source Type NewsCodes vocabulary for the complete list.
Guidance on using Digital Source Type
The IPTC Photo Metadata User Guide contains a section on Guidance for using Digital Source Type including examples for various types of media, including images, video, audio and text. The examples referenced in this guide can also apply to NewsML-G2 content.
Where Digital Source Type can be used in NewsML-G2 documents
The new <digitalSourceType> property can be added to the contentMeta section of any G2 NewsItem, PackageItem, KnowledgeItem, ConceptItem or PlanningItem to describe the digital source type of an item in its entirety.
It can also be used in the partMeta section of any G2 NewsItem, PackageItem or KnowledgeItem to describe the digital source type of a part of the item. In this way, content such as a video that includes some captured shots and AI-generated shots can be fully described using NewsML-G2.
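The mixed-video case can be sketched as follows. This is an illustration, not a schema-valid document: it omits the NewsML-G2 namespace and all other required content, and it assumes the value is carried in a qcode attribute, as the "ID (in QCode format)" values suggest.

```python
import xml.etree.ElementTree as ET


def add_digital_source_type(meta: ET.Element, qcode: str) -> ET.Element:
    """Append a <digitalSourceType> child to a contentMeta or partMeta element."""
    return ET.SubElement(meta, "digitalSourceType", {"qcode": qcode})


# A video made of one captured shot and one AI-generated shot.
news_item = ET.Element("newsItem")

captured_part = ET.SubElement(news_item, "partMeta")
add_digital_source_type(captured_part, "digsrctype:digitalCapture")

generated_part = ET.SubElement(news_item, "partMeta")
add_digital_source_type(generated_part, "digsrctype:trainedAlgorithmicMedia")

xml_text = ET.tostring(news_item, encoding="unicode")
```

The example instance documents and XML Schema documentation for version 2.33 are the place to confirm the exact attributes and element order.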
Find out more about NewsML-G2 v2.33
All information related to NewsML-G2 2.33 is at https://iptc.org/std/NewsML-G2/2.33/.
The NewsML-G2 Specification document has been updated to cover the new version 2.33.
Example instance documents are at https://iptc.org/std/NewsML-G2/2.33/examples/.
Full XML Schema documentation is located at https://iptc.org/std/NewsML-G2/2.33/specification/XML-Schema-Doc-Power/
XML source documents and unit tests are hosted in the public NewsML-G2 GitHub repository.
The NewsML-G2 Generator tool has also been updated to produce NewsML-G2 2.33 files using the version 38 catalog.
For any questions or comments, please contact us via the IPTC Contact Us form or post to the iptc-newsml-g2@groups.io mailing list. IPTC members can ask questions at the weekly IPTC News Architecture Working Group meetings.
Updated in June 2024 to include an image containing the new metadata property
Many image rights owners noticed that their assets were being used as training data for generative AI image creators, and asked the IPTC for a way to express that such use is prohibited. The new version 2023.1 of the IPTC Photo Metadata Standard now provides means to do this: a field named “Data Mining” and a standardised list of values, adopted from the PLUS Coalition. These values can show that data mining is prohibited or allowed either in general, for AI or Machine Learning purposes or for generative AI/ML purposes. The standard was approved by IPTC members on 4th October 2023 and the specifications are now publicly available.
Because these data fields, like all IPTC Photo Metadata, are embedded in the file itself, the information will be retained even after an image is moved from one place to another, for example by syndicating an image or moving an image through a Digital Asset Management system or Content Management System used to publish a website. (Of course, this requires that the embedded metadata is not stripped out by such tools.)
Created in a close collaboration with PLUS Coalition, the publication of the new properties comes after the conclusion of a public draft review period earlier this year. The properties are defined as part of the PLUS schema and incorporated into the IPTC Photo Metadata Standard in the same way that other properties such as Copyright Owner have been specified.
The new properties are now finalised and published. Specifically, the new properties are as follows:
- Data Mining: a field with a value from a controlled value vocabulary. Values come from the PLUS Data Mining vocabulary, reproduced here:
- http://ns.useplus.org/ldf/vocab/DMI-UNSPECIFIED (Unspecified – no prohibition defined)
- http://ns.useplus.org/ldf/vocab/DMI-ALLOWED (Allowed)
- http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING (Prohibited for AI/ML training)
- http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-GENAIMLTRAINING (Prohibited for Generative AI/ML training)
- http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-EXCEPTSEARCHENGINEINDEXING (Prohibited except for search engine indexing)
- http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED (Prohibited)
- http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT (Prohibited, see Other Constraints property)
- http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR (Prohibited, see Embedded Encoded Rights Expression property)
- http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEELINKEDRIGHTSEXPR (Prohibited, see Linked Encoded Rights Expression property)
- Other Constraints: Also defined in the PLUS specification, this text property is to be used when the Data Mining property has the value “http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT”. It can specify, in a human-readable form, what other constraints may need to be followed to allow Data Mining, such as “Generative AI training is only allowed for academic purposes”.
The IPTC and PLUS Coalition wish to draw users’ attention to the following notice included in the specification:
Regional laws applying to an asset may prohibit, constrain, or allow data mining for certain purposes (such as search indexing or research), and may overrule the value selected for this property. Similarly, the absence of a prohibition does not indicate that the asset owner grants permission for data mining or any other use of an asset.
The prohibition “Prohibited except for search engine indexing” only permits data mining by search engines available to the public to identify the URL for an asset and its associated data (for the purpose of assisting the public in navigating to the URL for the asset), and prohibits all other uses, such as AI/ML training.
The IPTC encourages all photo metadata software vendors to incorporate the new properties into their tools as soon as possible, to support the needs of the photo industry.
ExifTool, the command-line tool for accessing and manipulating metadata in image files, already supports the new properties. Support was added in the ExifTool version 12.67 release, which is available for download on exiftool.org.
The new version of the specification can be accessed at https://www.iptc.org/std/photometadata/specification/IPTC-PhotoMetadata or from the navigation menu on iptc.org. The IPTC Get Photo Metadata tool and IPTC Photo Metadata Reference images have been updated to use the new properties.
The IPTC and PLUS Coalition wish to thank many IPTC and PLUS member organisations and others who took part in the consultation process around these changes. For further information, please contact IPTC using the Contact Us form.
The IPTC NewsCodes Working Group has just released a new batch of changes to the IPTC NewsCodes family of controlled vocabularies.
Note that we skipped the Q2 update this year because there weren’t many changes, and also because there were already so many changes in Q1 of this year.
Media Topic changes
Here’s a summary of changes to Media Topic vocabulary:
- 2 new concepts: sustainability, profit sharing
- 3 retired concepts: justice, restructuring and recapitalisation, soft commodity
- 7 modified names (labels): restructuring and recapitalisation, soft commodity, study of law, sport shooting, sport organisation, recreational hiking and climbing, mountaineering, disabilities (German and Norwegian translations)
- 2 modified definitions: mountaineering, sport organisation
Change to Media Topic tree browser
We have made a small change to the Media Topic tree browser tool: we now display a small “i” icon next to the label name for terms that have notes defined.
The terms that have notes are usually retired terms, and the note gives the user information regarding which terms should be used instead of the retired term. But in other cases notes are used to help explain changes or clarify usage.
Changes to other vocabularies
Other vocabularies have also been updated:
- Content Production Party Role sees two new terms, contentEditor and metadataEditor, that can be used to show changes made by humans or systems (such as AI engines)
- Format had a small change to indicate that it is not just for NewsML 1 documents.
- User Action Type had a small bug fix, changed references to Twitter / X and retired Google Plus as a term. More changes will be coming soon covering other social media platforms and ways to track user interactions with media content.
- The rendition CV has been updated to make it more generic – renditions can apply to any type of media, not just images and video.
- The digitalsourcetype CV had already been updated in July to handle inpainting and outpainting but we mention it again here as a reminder.
Thanks to the representatives from IPTC members AFP, NTB, Bonnier News, ABC Australia, Bloomberg, New York Times and Associated Press for their contributions to the changes this quarter via the NewsCodes Working Group.
We are still working on our regular review of Media Topics – currently we are in the middle of a review of the Economy branch. The review is not yet complete but we hope for it to be ready for the Q4 or Q1 update.
NEW YORK, NY, 26 JULY 2023: The IPTC today announced the beginning of a public feedback and review period of IPTC Sport Schema, which aims to be “the standard for the next generation of sports data.”
The announcement was made by Paul Kelly, Lead of the IPTC Sports Content Working Group, at the Sports Video Group’s Content Management Forum held at 230 Fifth Penthouse, New York.
“The SVG Content Management Forum is attended by senior tech experts from sports broadcasters and sports leagues from the US and around the world, so it is the perfect place to launch the IPTC Sport Schema,” said Kelly. “Many members of SVG have advised us on our work so far, including organisations such as Warner Bros Discovery, NBC Universal, PGA TOUR, Major League Baseball and Riot Games. Presenting our work at their event is a great way to say thanks for their help.”
While IPTC Sport Schema is not yet an official IPTC standard, the IPTC Sports Content Working Group feels that the schema is solid enough to be published for public feedback.
Sports data for the era of linked data and knowledge graphs
The purpose of the IPTC Sport Schema project is to create a new RDF-based sports data standard, while making the most of the experience the IPTC has gained from the last 20 years of maintaining SportsML, the open XML-based sports data standard used by news and sports organisations around the world.
While XML served the industry well for many years, more recently developers and IPTC members have asked the Sports Content Working Group whether a standard would become available in a more modern serialisation format such as JSON, and whether knowledge graph protocols would be supported.
Because it is based on the W3C-standard RDF and OWL specifications, IPTC Sport Schema leverages the wide range of tools and expertise in the world of knowledge graphs, semantic web and linked open data, including the SPARQL query language, the JSON-LD serialisation into JSON format, inference using RDF Schema and OWL, and more.
“Using IPTC Sport Schema, sports leagues can choose to own their data,” said IPTC Managing Director Brendan Quinn. “Content publishers or sports leagues can publish open data on their website if they choose, in a way that can be re-mixed and re-used by others around the world.” IPTC Sport Schema can also be used for a more traditional model of aggregation and syndication by sports statistics providers who add value to the raw data being collected by sports leagues.
Like its ancestor SportsML, IPTC Sport Schema is created as a generic sports data model that can represent results, statistics, schedules and rosters across many sports. “Plugins” for specific sports extend the generic schema with specific statistics elements for 10 sports such as soccer, motor racing, tennis, rugby and esports. But the generic model can be used to handle any sports competition, whether team-based, head-to-head or individual.
As well as IPTC’s SportsML standard, the project is based on previous work by the BBC on its BBC Sport Ontology (some of its creators worked on this project). We have also consulted with and analysed related projects and formats such as OpenTrack and the IOC’s Olympics Data Feed format.
For more information on IPTC Sport Schema, please see the dedicated site sportschema.org or the project’s GitHub repository.
Those who are interested in the details can see an introduction to the IPTC Sport Schema ontology design, the full ontology diagram or full RDF/OWL ontology documentation.
There may be significant changes to the schema between now and when it is released as a fully endorsed IPTC Standard, so we don’t recommend that it is implemented in production systems yet. But we welcome analysis and experimentation with the model, and look forward to seeing feedback from those who would like to implement it in the real world.
People and organisations who are not IPTC members can give feedback by posting to the IPTC SportsML public discussion group or by using the IPTC Contact Us form.
The IPTC Standards Committee is happy to announce that ninjs, IPTC’s schema for marking up news content in JSON, has been revised to versions 2.1 and 1.5.
The vote to approve the new versions was taken at the recent IPTC Spring Meeting in Tallinn, Estonia and online.
This is in keeping with IPTC’s decision to maintain two parallel versions of ninjs: one for those who can’t upgrade to the 2.x version for backwards-compatibility reasons, and one for those who prefer the simpler structure of ninjs 2.x that is easier to handle in some tools.
The ninjs User Guide has been updated to reflect the changes, which are summarised below.
ContactInfo added to ninjs 1.5 and 2.1
ninjs 2.1 and ninjs 1.5 both include the new contactinfo structure which can be used in the people, organisations, places and infosources properties (and their ninjs 1.x equivalents person, organisation, place and infosource).
The contactInfo structure can contain physical or online contact information such as a street address or postal address, a username on social media such as Twitter, Instagram or TikTok, or even a locator such as what3words.
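A consumer reading such a structure might pull out contact details by type. The sketch below assumes each contactinfo entry is an object with a "type" and, for simple entries, a "value", as in the examples in this post; the entity and its values are invented for illustration.

```python
from typing import List

# Hypothetical organisation entry; names and values are placeholders.
ORGANISATION = {
    "name": "Example News Agency",
    "contactinfo": [
        {"type": "web", "value": "https://example.org/"},
        {"type": "phone", "role": "work", "value": "+00 000 0000"},
        {"type": "address", "address": {"lines": ["1 Example Street", "Example City"]}},
    ],
}


def contact_values(entity: dict, contact_type: str) -> List[str]:
    """Return the 'value' of every contactinfo entry of the given type.

    Entries without a 'value' (such as structured addresses) are skipped.
    """
    return [
        entry["value"]
        for entry in entity.get("contactinfo", [])
        if entry.get("type") == contact_type and "value" in entry
    ]


websites = contact_values(ORGANISATION, "web")
```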
Here are some examples of how the contactinfo property can be used:
"people": [
  {
    "name": "Jonas Svensson",
    "contactinfo": [
      { "type": "phone", "role": "work", "value": "+46 (0)8-7887500" }
    ]
  }
],
"organisations": [
  {
    "name": "International Committee of the Red Cross",
    "contactinfo": [
      { "type": "web", "value": "https://www.icrc.org/" },
      {
        "type": "address",
        "address": {
          "lines": [ "19 Avenue de la paix", "1202 Geneva", "Switzerland" ]
        }
      },
      { "type": "telephone", "value": "+41 22 734 60 01" }
    ]
  }
]
Better support for organisation identifiers such as tickers, ISIN etc
ninjs 2.1 and 1.5 also include the new symboltype and symbol properties under symbols. The symboltype property can hold any URI describing the type of the symbol. The CV http://cv.iptc.org/newscodes/financialinstrumentsymboltype is recommended.
The ticker sub-property under symbols is now deprecated. This means that it can still be used if necessary, but use is not recommended.
We now recommend that ticker symbols are stored using symbol="TCKR" and symboltype="https://cv.iptc.org/newscodes/financialinstrumentsymboltype/Ticker".
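Existing data that still uses ticker could be migrated along these lines. This is a sketch, not an official migration tool: the function name is hypothetical, and the "exchange" key in the usage example only stands in for whatever other sub-properties an entry carries.

```python
TICKER_SYMBOLTYPE = "https://cv.iptc.org/newscodes/financialinstrumentsymboltype/Ticker"


def migrate_ticker(entry: dict) -> dict:
    """Return a copy of a symbols entry with 'ticker' replaced by symbol/symboltype.

    Entries without 'ticker', or that already carry 'symbol', are copied unchanged.
    """
    if "ticker" not in entry or "symbol" in entry:
        return dict(entry)
    migrated = {key: value for key, value in entry.items() if key != "ticker"}
    migrated["symbol"] = entry["ticker"]
    migrated["symboltype"] = TICKER_SYMBOLTYPE
    return migrated


# Usage: the old entry is left untouched and a migrated copy is produced.
old_entry = {"ticker": "TCKR", "exchange": "NYSE"}
new_entry = migrate_ticker(old_entry)
```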
Better support for machine classification
The subjects (ninjs 2.x) / subject (ninjs 1.x) properties now allow for the sub-properties creator, relevance and confidence.
This allows organisations to more accurately use machine-generated subject tags in their content, while stating that a tag was created by a machine (using the creator property), and giving numerical values for the relevance and confidence scores that are reported by machine tagging engines. (Of course, these properties can also be used for human-created subject tags if necessary!)
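A downstream system might use these sub-properties to decide which tags to trust. The sketch below keeps tags whose confidence meets a threshold and also keeps tags with no confidence value, on the assumption that those were applied by a person. The 0 to 1 score scale, the creator string and the threshold are all assumptions for illustration; check the ninjs schema for the actual value ranges.

```python
from typing import List

# Hypothetical subjects array (ninjs 2.x style); scores and creator are invented.
SUBJECTS = [
    {"name": "sustainability", "creator": "example-tagger", "relevance": 0.9, "confidence": 0.95},
    {"name": "profit sharing", "creator": "example-tagger", "relevance": 0.4, "confidence": 0.35},
    {"name": "news industry"},  # no scores: treated here as a human-applied tag
]


def trusted_subjects(subjects: List[dict], min_confidence: float) -> List[dict]:
    """Keep tags at or above the threshold, plus tags that carry no confidence value."""
    return [
        subject
        for subject in subjects
        if "confidence" not in subject or subject["confidence"] >= min_confidence
    ]


trusted_names = [subject["name"] for subject in trusted_subjects(SUBJECTS, 0.5)]
```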
In addition, some internal changes to the schema were made to fix a validation bug that existed in previous versions. In order to accommodate these changes, the ninjs 2.1 schema uses the https://json-schema.org/draft/2020-12/schema version of JSON Schema.
Thanks to Johan Lindgren, welcome Ian Young as Working Group Lead
At the Spring Meeting in Tallinn we said farewell to Johan Lindgren as Lead of the News in JSON Working Group.
Johan, of the TT news agency in Sweden, was instrumental in bringing the News in JSON Working Group back from its quiet period after the initial launch of ninjs. This directly led to the release of several new versions of ninjs over the past few years, and its adoption by many of the world’s top news providers.
The IPTC wishes to thank Johan for all his contributions, and wishes him well for his retirement.
Johan’s work will be taken over by Ian Young from PA Media Group / Alamy based in the UK. Ian steps up to the Lead role after participating in the Working Group for many years, since the earliest days of ninjs.
We thank Ian for being willing to take on the lead role, and we look forward to seeing what developments will emerge from the News in JSON Working Group in the future.