At the IPTC Autumn Meeting, the IPTC Standards Committee voted on a change proposed by the Photo Metadata Working Group, which created version 2024.1 of the IPTC Photo Metadata Standard.

The change is minor but important to some: the definition of the Keywords property now includes the following text:

Keywords to express the subject and other aspects of the content of the image. Keywords may be free text and don’t have to be taken from a controlled vocabulary. Codes from the controlled vocabulary IPTC Subject NewsCodes must go to the “Subject Code” field.

This aligns the property definition with the way in which many photo agencies and photographers were already using the field: to convey aspects such as the lighting or lens effects used, “mood” of the image, dominant colour and more.

We give examples of how the Keywords property may be used in the IPTC Photo Metadata User Guide.

The relevant files have all been updated for the new version:

We thank Agence France-Presse for their help in providing examples of how the Keywords property may be used.

The IPTC has worked together with the DPP and stakeholders from Reuters, Arqiva and Warner Brothers Discovery to develop a pioneering new initiative called DPP Live Production Exchange (LPX). The LPX protocol covers an API and a data schema for information related to news coverage of live events, including the ability for B2B event subscribers to be informed about upcoming news events and their coverage.

IPTC’s contribution to the project was to enhance and evolve our ninjs (News in JSON) standard to support news coverage of events and live-streamed content. The News in JSON Working Group dedicated a lot of its time to this work over the past two years, including participating in the DPP LPX Hackathon in Spring 2024.

Screenshot of the DPP LPX website JSON Schema page

The underlying data model for the events and planning work in ninjs comes from the IPTC News Architecture and is based on EventsML-G2, a part of the NewsML-G2 family of standards which was created over 10 years ago.

Ian Young of PA Media, Lead of the IPTC News in JSON Working Group, said “Basing the work of ninjs 3.0 on the stable foundation of the IPTC News Architecture made our work much simpler. IPTC members have been syndicating news events for years using this model so we know that it works. That meant that we could focus on making ninjs 3.0 handle live events and streaming video in a way that is practical and simple, both for developers and for users.”

IPTC Managing Director Brendan Quinn said “We owe our thanks to our teammates and partners on this project: David Thompson from IPTC liaison partners the DPP, JJ Eynon from CNN / Warner Brothers Discovery, Tania Vivero and Ian McLaren from Reuters (IPTC Voting Member), and Daniel Lynch from Arqiva (IPTC Associate Member). Through a very friendly and collegial but also productive and results-driven collaboration, we have arrived at a solution that should make syndicated news events much easier to handle in all newsroom workflows.”

ninjs 3.0 and the LPX API are or will soon be supported by tools from Arqiva, Reuters and Wolftech (who were recently acquired by Avid). We hope that many more implementations will emerge in the coming months.

For more on ninjs 3.0, see the following resources:

An extract of IPTC Media Topics vocabulary tree browser showing the new "show retired" button.

The IPTC NewsCodes Working Group is pleased to announce the latest release of the IPTC NewsCodes, our set of controlled vocabularies for the news industry.

Updates this time span many vocabularies, with the biggest updates to Media Topic and Digital Source Type.

Media Topic updates

Most of the recent work has been in the politics branch.

3 new concepts: by-election, recall election, coalition building

2 retired concepts: political campaigns, church elections

4 modified concept names (in English): voting system, referendum, fundamental rights, football (yes, we finally refer to the sport as “football” in en-GB and “soccer” in en-US!)

Modified concept definitions: 22 civil rights, election, voting system, intergovernmental elections, local elections, primary elections, referendum, regional elections, voting, fundamental rights, censorship and freedom of speech, freedom of religion, freedom of the press, human rights, football, political debates, privacy, women’s rights, breaking (breakdance)

1 hierarchy move: fundamental rights has been moved from politics to society.

Also, the Wikidata mapping URIs have all been changed to point to the http:// version of the URI instead of the https:// version. This follows the official Wikidata guidance.
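For systems that store their own copies of these mappings, the same normalisation could be sketched as follows. This is only an illustration: the function name is hypothetical, the entity ID Q42 is an arbitrary example rather than one taken from the Media Topic mappings, and it assumes that stored Wikidata URIs follow the usual www.wikidata.org/entity/ pattern.

```python
from urllib.parse import urlsplit, urlunsplit


def to_http_wikidata_uri(uri: str) -> str:
    """Return the http:// form of a Wikidata entity URI.

    Any URI that is not an https Wikidata entity URI is returned unchanged.
    """
    parts = urlsplit(uri)
    is_wikidata_entity = (
        parts.scheme == "https"
        and parts.netloc == "www.wikidata.org"
        and parts.path.startswith("/entity/")
    )
    if is_wikidata_entity:
        # Swap only the scheme; keep host, path, query and fragment as they are.
        return urlunsplit(("http",) + tuple(parts[1:]))
    return uri


if __name__ == "__main__":
    print(to_http_wikidata_uri("https://www.wikidata.org/entity/Q42"))
```

Comparing mapping URIs after this kind of normalisation avoids treating the http and https forms of one entity as two different concepts.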

See the official Media Topic vocabulary on the IPTC Controlled Vocabulary server, and an easier-to-navigate tree view. An Excel version of IPTC Media Topics is also available.

Digital Source Type updates

5 new concepts have been added:

2 concepts have been retired: Original media with minor human edits, and Digital art, as explained above.

8 concepts have had their names and definitions modified, while retaining the same machine-readable ID for backwards-compatibility purposes:

Our thanks go to IPTC representatives and experts from Partnership on AI, Google, Adobe, C2PA, CIPA and many others for their help in making these updates to our vocabulary, which is now widely used to identify Generative AI content.

Updates to other NewsCodes vocabularies

Alternative Identifier Role (altidrole)

  • Vocabulary’s name changed to fix a spelling mistake.
  • New concept: IPTC Video Metadata Hub ID (altidrole:vmhVideoId)

Event Occur Status (eocstat)

  • Fix spelling mistake “occurence” -> “occurrence” throughout.

Golf Shot (spgolshot)

Rights Property (rightsprop)

Sports Concept (spct)

The IPTC NewsCodes Working Group has released the latest update to IPTC NewsCodes vocabularies.

The changes are quite minor this time, but we still recommend that users stay up to date with the latest version.

Changes to Media Topics vocabulary

Our main subject classification taxonomy, IPTC Media Topics, has seen the following updates:

1 new concept

1 retired concept

32 modified definitions

These changes mostly correct spelling errors in en-GB where US spellings had slipped in, such as changing “behavior” to “behaviour” for en-GB:

wireless technology, tobacco and nicotine, economic trends and indicators, international economic institution, stocks and securities, adult and continuing education, upper secondary education, social learning, medical condition, Confucianism, relations between religion and government, road cycling, competitive dancing, sexual misconduct, developmental disorder, fraternal and community group, cyber warfare, public transport, taxi and ride-hailing, shared transport, business reporting and performance, business restructuring, commercial real estate, residential real estate, podcast, financial service, business service, news industry, diversity, equity and inclusion, sustainability, profit sharing, breaking (breakdance).

As usual, the Media Topics vocabularies can be viewed in the following ways:

Updates to other vocabularies

Horse Position (sphorposition)

A new term, “trainer”, has been added to https://cv.iptc.org/newscodes/sphorposition. This term is needed by IPTC Sport Schema.

For more information on IPTC NewsCodes in general, please see the IPTC NewsCodes Guidelines.

The IPTC News Architecture Working Group is happy to announce the release of NewsML-G2 version 2.34.

This version, approved at the IPTC Standards Committee Meeting at the New York Times offices on Wednesday 17th April 2024, contains one small change and one additional feature:

Change Request 218, increase nesting of <related> tags: this allows for <related> items to contain child <related> items, up to three levels of nesting. This can be applied to many NewsML-G2 elements:

  • pubHistory/published
  • QualRelPropType (used in itemClass, action)
  • schemeMeta
  • ConceptRelationshipsGroup (used in concept, event, Flex1PropType, Flex1RolePropType, FlexPersonPropType, FlexOrganisationPropType, FlexGeoAreaPropType, FlexPOIPropType, FlexPartyPropType, FlexLocationPropType)

Note that we chose not to allow for recursive nesting because this caused problems with some XML code generators and XML editors.
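Consumers that want to check incoming documents against this limit could measure the nesting depth themselves. The following is a minimal sketch, not part of the standard: it assumes that “three levels” counts the outermost <related> element as level one, the function names are hypothetical, and the rel and qcode values in the sample are made up for illustration.

```python
import xml.etree.ElementTree as ET

MAX_RELATED_DEPTH = 3  # assumption: the outermost <related> counts as level 1


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def max_related_depth(element, depth=0):
    """Return the longest chain of directly nested <related> elements."""
    deepest = depth
    for child in element:
        if local_name(child.tag) == "related":
            deepest = max(deepest, max_related_depth(child, depth + 1))
        else:
            # A different element interrupts the chain, so start counting again.
            deepest = max(deepest, max_related_depth(child, 0))
    return deepest


# Hypothetical fragment: the "ex:" schemes and codes are illustrative only.
SAMPLE = """
<concept>
  <related rel="ex:partOf" qcode="ex:season2024">
    <related rel="ex:partOf" qcode="ex:league1">
      <related rel="ex:partOf" qcode="ex:federation"/>
    </related>
  </related>
</concept>
"""

if __name__ == "__main__":
    depth = max_related_depth(ET.fromstring(SAMPLE))
    print(depth, "levels; within limit:", depth <= MAX_RELATED_DEPTH)
```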

Change Request 219, add dataMining element to rightsInfo: In line with other IPTC standards such as the IPTC Photo Metadata Standard and Video Metadata Hub, we have now added a new element to the <rightsInfo> block to convey a content owner’s wishes in terms of data mining of the content. We recommend the use of the PLUS Vocabulary that is also recommended for the other IPTC standards: https://ns.useplus.org/LDF/ldf-XMPSpecification#DataMining

Here are some examples of its use:

Denying all AI / Machine Learning training, including Generative AI training, using this content:

<rightsInfo>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING"/>
</rightsInfo>

A simple text-based constraint:

<rightsInfo>
  <usageTerms>
    Data mining allowed for academic and research purposes only.
  </usageTerms>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT" />
</rightsInfo>

A simple text-based constraint, expressed using a QCode instead of a URI:

<rightsInfo>
  <usageTerms>
    Reprint rights excluded.
  </usageTerms>
  <dataMining qcode="plusvocab:DMI-PROHIBITED-SEECONSTRAINT" />
</rightsInfo>

A text-based constraint expressed in both English and French:

<rightsInfo>
  <usageTerms xml:lang="en">
    Reprint rights excluded.
  </usageTerms>
  <usageTerms xml:lang="fr">
    droits de réimpression exclus
  </usageTerms>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT" />
</rightsInfo>

Using the “see embedded rights expression” constraint to express a complex machine-readable rights expression in RightsML:

<rightsInfo>
  <rightsExpressionXML langid="http://www.w3.org/ns/odrl/2/">
    <!-- RightsML goes here... -->
  </rightsExpressionXML>
  <dataMining uri="http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR"/>
</rightsInfo>
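On the receiving side, a consumer needs to read the value back out of the <rightsInfo> block, whether it arrives as a URI or as a QCode. The sketch below shows one way this might be done. It makes some assumptions: the function name is hypothetical, the mapping of the plusvocab scheme alias to the PLUS vocabulary URI prefix is hard-coded here although a real implementation would resolve it from the document's catalog, and the fragments are parsed without the NewsML-G2 namespace, as in the examples above.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumption: "plusvocab" resolves to this prefix. In a real NewsML-G2
# document the alias would be looked up in the referenced catalog.
SCHEME_ALIASES = {"plusvocab": "http://ns.useplus.org/ldf/vocab/"}


def data_mining_uri(rights_info_xml: str) -> Optional[str]:
    """Return the data mining value as a full URI, or None if it is absent."""
    root = ET.fromstring(rights_info_xml)
    for element in root.iter():
        # Compare local names so the code also works with namespaced documents.
        if element.tag.rsplit("}", 1)[-1] != "dataMining":
            continue
        if "uri" in element.attrib:
            return element.attrib["uri"]
        qcode = element.attrib.get("qcode", "")
        alias, separator, code = qcode.partition(":")
        if separator and alias in SCHEME_ALIASES:
            return SCHEME_ALIASES[alias] + code
    return None


if __name__ == "__main__":
    fragment = """
    <rightsInfo>
      <usageTerms>Reprint rights excluded.</usageTerms>
      <dataMining qcode="plusvocab:DMI-PROHIBITED-SEECONSTRAINT" />
    </rightsInfo>
    """
    print(data_mining_uri(fragment))
```

Normalising both forms to the full URI means that downstream code only has to compare against one list of PLUS values.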

For more information, contact the IPTC News Architecture Working Group via the public NewsML-G2 mailing list.

Paul Kelly speaking about IPTC Sport Schema at the Sports Video Group’s Content Management Forum in New York, July 2023.

The IPTC Sports Content Working Group is happy to announce the release of IPTC Sport Schema version 1.0.

The first new IPTC standard to be released in more than 10 years, IPTC Sport Schema is a comprehensive model for the storage, transmission and querying of sports data. It has been tested on real-world use cases that are common in any newsroom or sports organisation.

IPTC Sport Schema has evolved from its predecessor SportsML. In contrast to the document-oriented nature of SportsML, IPTC Sport Schema takes a data-centric approach which is better suited to systems dealing with large volumes of data and also helps with integration across data sets.

“We reached out to many companies dealing with sports content and built up a clear picture of their needs,” says IPTC Sports Content Working Group lead Paul Kelly. “They wanted up-to-date formats, easy querying, the ability to handle e-sports and the ability to cross-reference between different media and data silos. IPTC Sport Schema addresses those requirements with a new basic model at the abstract end, and adhering to common use cases to keep things grounded.”

Content in Sport Schema is represented in the W3C’s universal Resource Description Framework (RDF), which renders any kind of data as a triple in the form of subject->predicate->object. Each component of a Sport Schema triple has a reference to an ontology, which defines the model at the heart of the standard. Querying is done using the W3C’s SPARQL standard, a kind of SQL for RDF.
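To give a feel for the triple idea, here is a toy sketch in plain Python. It is not RDF or SPARQL and does not use the real Sport Schema ontology: every identifier (ex:team1, ex:hasMember and so on) is invented for illustration, and the match function only imitates the pattern-matching at the core of a SPARQL query, with None standing in for a query variable.

```python
from typing import Iterator, List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical data: these names are NOT taken from the Sport Schema ontology.
TRIPLES = [
    Triple("ex:team1", "ex:name", "Example United"),
    Triple("ex:team1", "ex:hasMember", "ex:player7"),
    Triple("ex:team1", "ex:hasMember", "ex:player9"),
    Triple("ex:player7", "ex:name", "A. Striker"),
    Triple("ex:player9", "ex:name", "B. Keeper"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in triples:
        if subject is not None and triple.subject != subject:
            continue
        if predicate is not None and triple.predicate != predicate:
            continue
        if obj is not None and triple.obj != obj:
            continue
        yield triple


def roster(triples: List[Triple], team: str) -> List[str]:
    """Join two patterns: find the team's members, then look up their names."""
    names = []
    for membership in match(triples, subject=team, predicate="ex:hasMember"):
        for name in match(triples, subject=membership.obj, predicate="ex:name"):
            names.append(name.obj)
    return names


if __name__ == "__main__":
    print(roster(TRIPLES, "ex:team1"))
```

A real deployment would load the data into an RDF store and express the roster lookup as a SPARQL query instead.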

The schema diagram for IPTC Sport Schema, showing the entities and the relationships between them. For more information see www.sportschema.org.

“The IPTC has been working on RDF and semantic web standards for more than 10 years, going back to rNews and RightsML,” said IPTC Managing Director Brendan Quinn. “So we are very happy to release another semantic standard that can help organisations to publish and share sports data in a vendor-neutral, interoperable way.”

Being RDF-based, IPTC Sport Schema can be rendered in XML, JSON and the simple Turtle format, and can be converted easily between all three formats using free tools such as Apache Jena.

“Those familiar with SportsML or SportsJS should recognise the basic components of Sport Schema,” says Kelly, “both in the ontology and in the sports vocabularies introduced with SportsML 3.0, which were designed specifically with semantic technologies in mind.”

To support take-up and share information about the new standard, the IPTC has created a dedicated website, sportschema.org. The site contains:

Those wishing to try out some SPARQL queries against some sports data should visit Sport Schema’s query endpoint. It includes example queries showing how to build a team roster, league standings and more from our sample data sets.

For more information on IPTC Sport Schema, see the IPTC’s landing pages on the IPTC Sport Schema standard, the standalone site sportschema.org, or the project’s GitHub repository.

If you are interested in joining those who are working on implementing IPTC Sport Schema in your project or your organisation, we would love to hear from you. Please contact us via IPTC’s contact form.

“A reel of film unspooling and transforming into a stream of binary digits”
Made with Bing Image Creator. Powered by DALL-E.

Following the IPTC’s recent announcement that Rights holders can exclude images from generative AI with IPTC Photo Metadata Standard 2023.1, the IPTC Video Metadata Working Group is very happy to announce that the same capability now exists for video, through IPTC Video Metadata Hub version 1.5.

The “Data Mining” property has been added to this new version of IPTC Video Metadata Hub, which was approved by the IPTC Standards Committee on October 4th, 2023. Because it uses the same XMP identifier as the Photo Metadata Standard property, the existing support in the latest versions of ExifTool will also work for video files.

Therefore, adding metadata to a video file that says it should be excluded from Generative AI indexing is as simple as running this command in a terminal window:

exiftool -XMP-plus:DataMining="Prohibited for Generative AI/ML training" example-video.mp4

(Please note that this will only work in ExifTool version 12.67 and above, i.e. any version of ExifTool released after September 19, 2023.)

The possible values of the Data Mining property are listed below:

PLUS URI | Description (use exactly this text with ExifTool)
http://ns.useplus.org/ldf/vocab/DMI-UNSPECIFIED | Unspecified – no prohibition defined
http://ns.useplus.org/ldf/vocab/DMI-ALLOWED | Allowed
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-AIMLTRAINING | Prohibited for AI/ML training
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-GENAIMLTRAINING | Prohibited for Generative AI/ML training
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-EXCEPTSEARCHENGINEINDEXING | Prohibited except for search engine indexing
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED | Prohibited
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEECONSTRAINT | Prohibited, see plus:OtherConstraints
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEEEMBEDDEDRIGHTSEXPR | Prohibited, see iptcExt:EmbdEncRightsExpr
http://ns.useplus.org/ldf/vocab/DMI-PROHIBITED-SEELINKEDRIGHTSEXPR | Prohibited, see iptcExt:LinkedEncRightsExpr
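For batch workflows, the mapping from PLUS URI to ExifTool text can be kept in code and used to build the command shown earlier. The following is a rough sketch under some assumptions: the function and variable names are hypothetical, only a subset of the rows above is included, and the script prints the command rather than running it, so ExifTool does not need to be installed to try it.

```python
import shlex

PLUS_VOCAB = "http://ns.useplus.org/ldf/vocab/"

# Subset of the table above: PLUS code -> text that ExifTool expects.
EXIFTOOL_TEXT = {
    "DMI-ALLOWED": "Allowed",
    "DMI-PROHIBITED-AIMLTRAINING": "Prohibited for AI/ML training",
    "DMI-PROHIBITED-GENAIMLTRAINING": "Prohibited for Generative AI/ML training",
    "DMI-PROHIBITED-EXCEPTSEARCHENGINEINDEXING": "Prohibited except for search engine indexing",
    "DMI-PROHIBITED": "Prohibited",
    "DMI-PROHIBITED-SEECONSTRAINT": "Prohibited, see plus:OtherConstraints",
}


def exiftool_args(plus_uri: str, video_path: str) -> list:
    """Build the ExifTool argument list that sets the Data Mining property."""
    if not plus_uri.startswith(PLUS_VOCAB):
        raise ValueError("not a PLUS vocabulary URI: " + plus_uri)
    code = plus_uri[len(PLUS_VOCAB):]
    if code not in EXIFTOOL_TEXT:
        raise ValueError("value not covered by this sketch: " + code)
    return ["exiftool", "-XMP-plus:DataMining=" + EXIFTOOL_TEXT[code], video_path]


if __name__ == "__main__":
    args = exiftool_args(PLUS_VOCAB + "DMI-PROHIBITED-GENAIMLTRAINING", "example-video.mp4")
    # Print a shell-quoted command line; pass `args` to subprocess.run to execute it.
    print(shlex.join(args))
```

Passing the arguments as a list keeps the spaces in the description text intact without relying on shell quoting.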

A corresponding new property “Other Constraints” has also been added to Video Metadata Hub v1.5. This property allows plain-text human-readable constraints to be placed on the video when using the “Prohibited, see plus:OtherConstraints” value of the Data Mining property.

The Video Metadata Hub User Guide and Video Metadata Hub Generator have also been updated to include the new Data Mining property added in version 1.5.

We look forward to seeing video tools (and particularly crawling engines for generative AI training systems) implement the new properties.

Please feel free to discuss the new version of Video Metadata Hub on the public iptc-videometadata discussion group, or contact IPTC via the Contact us form.

The IPTC NewsML-G2 Working Group and the News Architecture Working Group are happy to announce the release of the latest version of our flagship XML-based news syndication standard: NewsML-G2 v2.33.

Changes in the latest version are small but significant. We have added support for the Digital Source Type property, which is already used in IPTC’s sister standards IPTC Photo Metadata Standard, IPTC Video Metadata Hub and ninjs. This property can be used to declare when content has been created or modified by software, including by Generative AI engines.

Examples of other possible values for the digital source type property using the recommended IPTC Digital Source Type NewsCodes vocabulary are:

ID (in QCode format) | Name: definition | Example
digsrctype:digitalCapture | Original digital capture sampled from real life: The digital media is captured from a real-life source using a digital camera or digital recording device | Digital video taken using a digital film, video or smartphone camera
digsrctype:negativeFilm | Digitised from a negative on film: The digital image was digitised from a negative on film or any other transparent medium | Digital photo scanned from a photographic negative
digsrctype:minorHumanEdits | Original media with minor human edits: Minor augmentation or correction by a human, such as a digitally-retouched photo used in a magazine | Original audio with minor edits (e.g. to eliminate breaks)
digsrctype:algorithmicallyEnhanced | Algorithmic enhancement: Minor augmentation or correction by algorithm | A photo that has been digitally enhanced using a mechanism such as Google Photos’ “denoise” feature
digsrctype:dataDrivenMedia | Data-driven media: Digital media representation of data via human programming or creativity | Textual weather report generated by code using readings from weather detection instruments
digsrctype:trainedAlgorithmicMedia | Trained algorithmic media: Digital media created algorithmically using a model derived from sampled content | A “deepfake” video using a combination of a real actor and a trained model

The above list is a subset of the full list of recommended values. See the full IPTC Digital Source Type NewsCodes vocabulary for the complete list.

Guidance on using Digital Source Type

The IPTC Photo Metadata User Guide contains a section on Guidance for using Digital Source Type including examples for various types of media, including images, video, audio and text. The examples referenced in this guide can also apply to NewsML-G2 content.

Where Digital Source Type can be used in NewsML-G2 documents

The new <digitalSourceType> property can be added to the contentMeta section of any G2 NewsItem, PackageItem, KnowledgeItem, ConceptItem or PlanningItem to describe the digital source type of an item in its entirety.

It can also be used in the partMeta section of any G2 NewsItem, PackageItem or KnowledgeItem to describe the digital source type of a part of the item. In this way, content such as a video that includes some captured shots and AI-generated shots can be fully described using NewsML-G2.
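As an illustration of the item-level and part-level placement, the sketch below reads the digital source types from a simplified item. Please treat the details as assumptions: the sample omits the NewsML-G2 namespace and all mandatory attributes, the partid value is made up, the use of a qcode attribute (with uri as a fallback) follows the pattern of other G2 properties and has not been checked against the 2.33 schema, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical item: a captured video with one AI-generated shot.
SAMPLE = """
<newsItem>
  <contentMeta>
    <digitalSourceType qcode="digsrctype:digitalCapture"/>
  </contentMeta>
  <partMeta partid="shot2">
    <digitalSourceType qcode="digsrctype:trainedAlgorithmicMedia"/>
  </partMeta>
</newsItem>
"""


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def digital_source_types(item_xml: str) -> dict:
    """Map 'item' and 'part:<partid>' to the digital source types declared there."""
    result = {}
    for container in ET.fromstring(item_xml):
        name = local_name(container.tag)
        if name == "contentMeta":
            key = "item"
        elif name == "partMeta":
            key = "part:" + container.get("partid", "?")
        else:
            continue
        for element in container:
            if local_name(element.tag) == "digitalSourceType":
                # Assumption: the value is carried in qcode, or else in uri.
                value = element.get("qcode") or element.get("uri")
                result.setdefault(key, []).append(value)
    return result


if __name__ == "__main__":
    print(digital_source_types(SAMPLE))
```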

Find out more about NewsML-G2 v2.33

All information related to NewsML-G2 2.33 is at https://iptc.org/std/NewsML-G2/2.33/.

The NewsML-G2 Specification document has been updated to cover the new version 2.33.

Example instance documents are at https://iptc.org/std/NewsML-G2/2.33/examples/

Full XML Schema documentation is located at https://iptc.org/std/NewsML-G2/2.33/specification/XML-Schema-Doc-Power/

XML source documents and unit tests are hosted in the public NewsML-G2 GitHub repository.

The NewsML-G2 Generator tool has also been updated to produce NewsML-G2 2.33 files using the version 38 catalog.

For any questions or comments, please contact us via the IPTC Contact Us form or post to the iptc-newsml-g2@groups.io mailing list. IPTC members can ask questions at the weekly IPTC News Architecture Working Group meetings.

Dynamic fountains out of the Drau river in Villach, Carinthia, Austria (Europe). This image contains the new Data Mining property. Clicking on the image will show the metadata as extracted by IPTC’s online Get Photo Metadata tool.

Updated in June 2024 to include an image containing the new metadata property

Many image rights owners noticed that their assets were being used as training data for generative AI image creators, and asked the IPTC for a way to express that such use is prohibited. The new version 2023.1 of the IPTC Photo Metadata Standard now provides a means to do this: a field named “Data Mining” and a standardised list of values, adopted from the PLUS Coalition. These values can show that data mining is prohibited or allowed either in general, for AI or Machine Learning purposes or for generative AI/ML purposes. The standard was approved by IPTC members on 4th October 2023 and the specifications are now publicly available.

Because these data fields, like all IPTC Photo Metadata, are embedded in the file itself, the information will be retained even after an image is moved from one place to another, for example by syndicating an image or moving an image through a Digital Asset Management system or Content Management System used to publish a website. (Of course, this requires that the embedded metadata is not stripped out by such tools.)

Created in a close collaboration with PLUS Coalition, the publication of the new properties comes after the conclusion of a public draft review period earlier this year. The properties are defined as part of the PLUS schema and incorporated into the IPTC Photo Metadata Standard in the same way that other properties such as Copyright Owner have been specified.

The new properties are now finalised and published. Specifically, the new properties are as follows:

The IPTC and PLUS Coalition wish to draw users’ attention to the following notice included in the specification:

Regional laws applying to an asset may prohibit, constrain, or allow data mining for certain purposes (such as search indexing or research), and may overrule the value selected for this property. Similarly, the absence of a prohibition does not indicate that the asset owner grants permission for data mining or any other use of an asset.

The prohibition “Prohibited except for search engine indexing” only permits data mining by search engines available to the public to identify the URL for an asset and its associated data (for the purpose of assisting the public in navigating to the URL for the asset), and prohibits all other uses, such as AI/ML training.

The IPTC encourages all photo metadata software vendors to incorporate the new properties into their tools as soon as possible, to support the needs of the photo industry.

ExifTool, the command-line tool for accessing and manipulating metadata in image files, already supports the new properties. Support was added in the ExifTool version 12.67 release, which is available for download on exiftool.org.

The new version of the specification can be accessed at https://www.iptc.org/std/photometadata/specification/IPTC-PhotoMetadata or from the navigation menu on iptc.org. The IPTC Get Photo Metadata tool and IPTC Photo Metadata Reference images have been updated to use the new properties.

The IPTC and PLUS Coalition wish to thank the many IPTC and PLUS member organisations and others who took part in the consultation process around these changes. For further information, please contact IPTC using the Contact Us form.

Screenshot of the change to the Media Topic tree browser tool, showing information icons where terms have had notes added.

The IPTC NewsCodes Working Group has just released a new batch of changes to the IPTC NewsCodes family of controlled vocabularies.

Note that we skipped the Q2 update this year because there weren’t many changes, and also because there were already so many changes in Q1 of this year.

Media Topic changes

Here’s a summary of changes to the Media Topic vocabulary:

Change to Media Topic tree browser

We have made a small change to the Media Topic tree browser tool: we now display a small “i” icon next to the label name for terms that have notes defined.

The terms that have notes are usually retired terms, and the note gives the user information regarding which terms should be used instead of the retired term. But in other cases notes are used to help explain changes or clarify usage.

Changes to other vocabularies

Other vocabularies have also been updated:

  • Content Production Party Role sees two new terms, contentEditor and metadataEditor, that can be used to show changes made by humans or systems (such as AI engines)
  • Format had a small change to indicate that it is not just for NewsML 1 documents.
  • User Action Type had a small bug fix, changed references to Twitter / X and retired Google Plus as a term. More changes will be coming soon covering other social media platforms and ways to track user interactions with media content.
  • The rendition CV has been updated to make it more generic – renditions can apply to any type of media, not just images and video.
  • The digitalsourcetype CV had already been updated in July to handle inpainting and outpainting but we mention it again here as a reminder.

Thanks to the representatives from IPTC members AFP, NTB, Bonnier News, ABC Australia, Bloomberg, New York Times and Associated Press for their contributions to the changes this quarter via the NewsCodes Working Group.

We are still working on our regular review of Media Topics – currently we are in the middle of a review of the Economy branch. The review is not yet complete but we hope for it to be ready for the Q4 or Q1 update.