Recently, conversations on Twitter and various blogs and news sites have reported on Facebook’s use of IPTC embedded photo metadata fields to “track users”. (Reddit.com: “Facebook is embedding tracking data inside the photos you download”, The Australian: “Facebook pics tracking you”, Forbes: “Facebook Embeds ‘Hidden Codes’ To Track Who Sees And Shares Your Photos”, Financial Express: “Beware! Facebook embeds tracking data inside photos you download”).
As the creators and maintainers of the IPTC Photo Metadata Standard, we want to clarify a few points and share our own analysis of the situation.
In Spring 2019, IPTC’s Photo Metadata Working Group conducted our latest round of tests regarding how various social media platforms deal with metadata embedded in uploaded and shared images. The 2019 test results show how Facebook treats image metadata: in the IIM and Exif formats, a few fields related to claiming rights are retained while all others are removed, and in the XMP format all fields are removed.
While this was a small improvement compared to the previous IPTC test in 2016 when all Exif fields were removed, we did not rate Facebook with a “green dot” showing compliance with IPTC standards, as removing metadata embedded by the owner of an image contradicts IPTC’s strong support for keeping metadata persistent.
In addition, in both the 2016 and 2019 tests the Working Group found that two fields in the IIM format do indeed appear to be given values populated by Facebook.
IPTC looks at the facts
IPTC provides a reference image for each version of its Photo Metadata Standard which contains a test value for every specified metadata field. This makes it easy to test which fields are removed or modified.
The reference image of the 2017.1 version of the standard was uploaded to Facebook by the Working Group member David Riecks and it can still be seen here. Next, the group used IPTC’s Get IPTC Photo Metadata website tool, which retrieves the embedded metadata of most images shown on the web. Anyone can use this tool: simply enter the URL of the image into the site’s form and click to see all the metadata embedded in the image.
This test was performed using the URL of the IPTC reference image uploaded to Facebook and the result was shown instantly:
- Embedded metadata fields in the IIM format related to rights were retained: Creator, Creator Job Title, Copyright Notice, Credit Line, Source and Description Writer.
- All embedded metadata using the XMP format were removed by Facebook.
- The Creator and the Copyright Notice in the Exif format were also retained.
- The Instructions field and the Job Id field in IIM show values significantly different from what had been uploaded. The IPTC Working Group assumes these values were inserted by Facebook:
- The value of the Instructions field starts with FBMD. The IPTC Working Group retrieved this image using “Save As…” and another Facebook user uploaded it to his account. Result: the value was not changed during the second upload to Facebook. These results were shown for the re-uploaded image.
- The value of the Job Id field looks like a unique identifier. If an uploaded image is downloaded using the Save As function and then uploaded by another Facebook user, this field contains a different value.
- The IPTC Working Group searched for any documentation of these inserted values but found no specification or statement from Facebook. There have been, however, many guesses and assumptions by users and developers.
Using the Get IPTC Photo Metadata site, anybody can check what Facebook values were applied to her or his photo. As a user, you can find Facebook image URLs by clicking on the image on the Facebook site and using the “Copy image address”, “Inspect” or “Inspect Element” function of your web browser; you should then see the URL.
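For readers who would rather script the comparison described in the test results, here is a minimal Python sketch. It is not IPTC's tooling. It assumes the IIM datasets of two downloaded copies have already been read into dictionaries keyed by (record, dataset) number; Pillow's `IptcImagePlugin.getiptcinfo`, for example, returns data in roughly that shape. The dataset numbers 2:40 (Special Instructions) and 2:103 (Original Transmission Reference, displayed as Job Id) are taken from the IIM specification. The function name and the sample values are hypothetical.

```python
from typing import Mapping, Tuple, Union

# IIM dataset numbers in record 2 (the application record).
INSTRUCTIONS = (2, 40)   # Special Instructions
JOB_ID = (2, 103)        # Original Transmission Reference, shown as "Job Id"

IimValue = Union[bytes, str]


def _text(value: IimValue) -> str:
    """Decode an IIM value to text, tolerating undecodable bytes."""
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    return value


def compare_uploads(first: Mapping[Tuple[int, int], IimValue],
                    second: Mapping[Tuple[int, int], IimValue]) -> dict:
    """Compare Instructions and Job Id across two copies of one picture."""
    instr_a = _text(first.get(INSTRUCTIONS, b""))
    instr_b = _text(second.get(INSTRUCTIONS, b""))
    job_a = _text(first.get(JOB_ID, b""))
    job_b = _text(second.get(JOB_ID, b""))
    return {
        "instructions_starts_with_fbmd": instr_a.startswith("FBMD"),
        "instructions_unchanged": instr_a == instr_b,
        "job_id_changed": job_a != job_b,
    }


# Hypothetical values for illustration only (not real Facebook output).
original = {INSTRUCTIONS: b"FBMD-example-value", JOB_ID: b"id-one"}
reupload = {INSTRUCTIONS: b"FBMD-example-value", JOB_ID: b"id-two"}
print(compare_uploads(original, reupload))
```

With the sample values, all three flags come out true, which mirrors the pattern the Working Group observed: the Instructions value survives a re-upload while the Job Id value does not.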
IPTC’s summary
IPTC tests showed that when a Facebook member uploads an image, the Facebook system removes a lot of fields, keeps only a few related to rights, and replaces or adds values to the Job Id and the Instructions fields. The role of these values is not publicly documented by Facebook, so they are currently the subject of significant speculation.
IPTC makes no assumptions about what the metadata values are used for, but Facebook appears to keep the value of the Instructions field constant even when the image is re-uploaded by another user. The Job Id field, on the other hand, changes with each separate upload.
Our recommendations are that all embedded metadata values should be retained by platforms and that no platform should be overwriting user metadata.
IPTC’s 2019 Social Media Platforms survey also looked at the metadata usage of other major social media platforms. Interested parties can find more information at Social Media Sites Photo Metadata Test Results 2019.
Technical notes
The example metadata values embedded into the 2017.1 reference image can be checked by going to https://getpmd.iptc.org and clicking on the green button in Option A labeled Get Photo Metadata of Web Image. No image URL is required, as by default the metadata of this reference image is retrieved and displayed.
For those interested in the technical details of embedded photo metadata, the technical formats IIM and XMP are introduced in the IPTC Photo Metadata User Guide, including a look under the hood of image files.
Home and away teams
alignment attribute.
Pre-game actions
<actions>
<action sequence-number="1" team-idref="team_9572" type="esacttype:remove" comment="Nuke"></action>
<action sequence-number="2" team-idref="team_6134" type="esacttype:remove" comment="Inferno"></action>
<action sequence-number="3" team-idref="team_9572" type="esacttype:choose" comment="Cache"></action>
<action sequence-number="4" team-idref="team_6134" type="esacttype:choose" comment="Train"></action>
<action sequence-number="5" team-idref="team_9572" type="esacttype:remove" comment="Overpass"></action>
<action sequence-number="6" team-idref="team_6134" type="esacttype:remove" comment="Dust2"></action>
<action sequence-number="7" type="esacttype:remaining" comment="Mirage"></action>
</actions>
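To illustrate how a consumer might read this pre-game sequence, here is a minimal Python sketch using only the standard library. It assumes the snippet is processed on its own, without the XML namespaces a complete SportsML document would carry, and the function name `read_actions` is hypothetical.

```python
import xml.etree.ElementTree as ET

ACTIONS_XML = """
<actions>
<action sequence-number="1" team-idref="team_9572" type="esacttype:remove" comment="Nuke"></action>
<action sequence-number="2" team-idref="team_6134" type="esacttype:remove" comment="Inferno"></action>
<action sequence-number="3" team-idref="team_9572" type="esacttype:choose" comment="Cache"></action>
<action sequence-number="4" team-idref="team_6134" type="esacttype:choose" comment="Train"></action>
<action sequence-number="5" team-idref="team_9572" type="esacttype:remove" comment="Overpass"></action>
<action sequence-number="6" team-idref="team_6134" type="esacttype:remove" comment="Dust2"></action>
<action sequence-number="7" type="esacttype:remaining" comment="Mirage"></action>
</actions>
"""


def read_actions(xml_text):
    """Return (sequence, team, action, map) tuples ordered by sequence number."""
    root = ET.fromstring(xml_text)
    rows = []
    for action in root.findall("action"):
        rows.append((
            int(action.get("sequence-number")),
            action.get("team-idref"),              # None for the leftover map
            action.get("type").split(":", 1)[-1],  # drop the "esacttype:" prefix
            action.get("comment"),
        ))
    return sorted(rows, key=lambda row: row[0])


for seq, team, kind, game_map in read_actions(ACTIONS_XML):
    print(seq, team or "-", kind, game_map)
```

Sorting on `sequence-number` rather than relying on document order keeps the result stable even if a provider serialises the actions in a different order.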
Statistics for eSports teams, players and tournaments
scoping-label on outcome-totals in SportsML:
<team-stats score="16" event-outcome="speventoutcome:win">
<outcome-totals scoping-label="T" wins="4" />
<outcome-totals scoping-label="CT" wins="12"/>
</team-stats>
<player-stats>
<rating rating-value="1.11"/>
<stats>
<stat stat-type="esstat:kills" value="15" />
<stat stat-type="esstat:headshot" value="6" />
<stat stat-type="esstat:assist" value="4" />
<stat stat-type="esstat:flashassist" value="2" />
<stat stat-type="esstat:deaths" value="11" />
<stat stat-type="esstat:KAST" value="78.3" />
<stat stat-type="esstat:ADR" value="68.4" />
<stat stat-type="esstat:FKdiff" value="0" />
</stats>
</player-stats>
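As a rough illustration of why the generic stat element is convenient for consumers, the following Python sketch reads every stat-type and value pair from the player statistics above into a dictionary. It assumes the snippet is parsed without XML namespaces, and the function name `read_stats` is hypothetical.

```python
import xml.etree.ElementTree as ET

PLAYER_STATS_XML = """
<player-stats>
<rating rating-value="1.11"/>
<stats>
<stat stat-type="esstat:kills" value="15" />
<stat stat-type="esstat:headshot" value="6" />
<stat stat-type="esstat:assist" value="4" />
<stat stat-type="esstat:flashassist" value="2" />
<stat stat-type="esstat:deaths" value="11" />
<stat stat-type="esstat:KAST" value="78.3" />
<stat stat-type="esstat:ADR" value="68.4" />
<stat stat-type="esstat:FKdiff" value="0" />
</stats>
</player-stats>
"""


def read_stats(xml_text):
    """Map each stat-type (without its prefix) to a numeric value."""
    root = ET.fromstring(xml_text)
    stats = {}
    for stat in root.iter("stat"):
        name = stat.get("stat-type").split(":", 1)[-1]
        stats[name] = float(stat.get("value"))
    return stats


stats = read_stats(PLAYER_STATS_XML)
print(stats["kills"] - stats["deaths"])  # kill/death difference
```

Because every statistic has the same shape, a new stat-type needs no change to the parsing code; only the controlled vocabulary behind the prefix has to grow.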
With the stat construction with stat-type and value we can handle any type of statistic.
The prefixes esstat: and esacttype: in these examples do not currently exist in the IPTC NewsCodes catalog but could easily be set up if needed. It might be necessary to have different prefixes for different types of eSports games, but that would require some more investigation.
Last week’s 2019 IPTC Photo Metadata Conference was again hosted in association with the CEPIC Congress. This year’s conference was held in a slightly rainy Paris but at least that meant that we didn’t mind staying indoors in late May.
The event kicked off with an introduction from event chair Stéphane Guérillot from AFP, who is also on the Board of IPTC and Chair of the IPTC Standards Committee. The theme of the afternoon was “putting IPTC metadata to work for your image collections” and the emphasis on practical outcomes was a constant refrain.
The first panel was around the question of “do we still need IPTC Photo Metadata?” Michael Steidl, lead of the IPTC Photo Metadata Working Group, started off by presenting results from the IPTC Photo Metadata surveys that the Working Group undertook earlier this year. Lúí Smyth from Shutterstock showed how metadata has helped them to organise millions of photos from thousands of sources. Isabelle Wirth, photo editor at AFP, discussed how the agency uses IPTC Photo Metadata along with other IPTC standards such as NewsCodes and NewsML-G2 to make content searchable and shareable for their clients. And independent photographer and 3D photogrammetry expert with Deep3D, Simon Brown, explained how metadata was crucial for creating 3D views of sunken shipwrecks via tens of thousands of still photographs and some innovative software. In Simon’s words: “For more than one 3D project, projects with multiple contributors, or projects conducted over a longer period of time, IPTC entry becomes mandatory.”
The next session examined how creating and editing IPTC Photo Metadata could be improved. Sarah Saunders, representing CEPIC, presented results from the IPTC Photo Metadata surveys of both image suppliers and software makers, showing that metadata usage has grown in sophistication but still varies greatly between independent photographers and large companies. Andrew Wiard, photographer and member of the British Press Photographers’ Association, spoke with passion about how we could improve the handling of photo metadata once it leaves the photographer’s desk. This is a constant goal of the Photo Metadata Working Group and will form part of our work plan for the rest of 2019. Mayank Sagar from Image Data Systems showed some exciting tools, with videos showing how their AI algorithms can detect objects ranging from commuters’ luggage and handbags to brands and logos on advertisements in sports footage, and talked about the current limits of AI classification and future issues such as how to handle artificially synthesised images. Andreas Gnutzmann of popular photo management software Fotoware showed how their system is moving to the cloud, putting metadata at its core even more than previously.
The third session looked at the end-user side and how the industry can benefit from photo metadata. Brendan Quinn of IPTC presented the Photo Metadata Crawler project, examining how news publishers around the world are embedding photo metadata in the images used on their sites. Michael Steidl showed results of the Photo Metadata Working Group’s updated analysis of social media systems and sharing platforms, which will be shared through an IPTC news article in the coming months. And Anna Dickson of Google gave us an update on her history working with images as photo editor at Huffington Post and Dow Jones among others, and discussed how Google are working with metadata and the IPTC, including our shared challenges of encouraging more site owners to publish embedded metadata so that it can be picked up by Google Search and other services. At the event, Google also announced some very interesting features that are currently in the pipeline.
Michael Steidl and Stéphane Guérillot closed out the event by talking about the work that the IPTC Photo Metadata Working Group would be undertaking this year as a result of the discussions and of the survey results.
All slides from the day are available in PDF format from the event page, both to IPTC members and non-members.
Key findings from the Photo Metadata surveys will be shared in future news posts, so please watch this space for updates.
More information about the Google presentation and their proposed new features around image metadata is available to all IPTC members who have joined the Photo Metadata Working Group.
Thanks to all the speakers, to CEPIC for their assistance in hosting the conference, and to everyone who attended for making the event such a success!
At the IPTC Spring Meeting in Lisbon, the IPTC Standards Committee signed off on version 3.1 of SportsML.
Updates include:
- Added round-number attribute to baseEventMetadataComplexType
- Added events-discarded to outcomeTotalsComplexType and result-status to base3StatsComplexType to support events where players or teams can discard some of their results.
- Fixed examples to use the correct qcodes nprt:given, nrol:short etc for names
- Corrected description of distance in actionAttributes
You can download the ZIP Package of SportsML 3.1 with XML Schemas and documentation included.
Development of SportsML is open to collaboration. Your feedback on the SportsML Users Forum is welcome!
We’re excited that the biggest week in the photo metadata calendar has arrived – the IPTC Photo Metadata Conference 2019 will be held in Paris this Thursday, 6 June.
We are looking forward to hearing from some IPTC members: Andreas Gnutzmann from Fotoware, Lúí Smyth from Shutterstock, Isabelle Wirth of Agence France Presse and Michael Steidl, Chair of the Photo Metadata Working Group and honourable member of IPTC. Stéphane Guérillot, CEO of AFP Blue, will be chairing the event.
We will also be hearing from independent photographer Andrew Wiard, representing the British Press Photographers’ Association (BPPA), plus Anna Dickson, Visual Lead, Image Search at Google, who will bring her expertise as one of Google’s experts on images along with her history leading photography teams at Dow Jones and Huffington Post. Mayank Sagar from Image Data Systems will be speaking about the latest developments in automatic image tagging, and Simon Brown of Deep3D will look at the photographer’s view around embedding metadata.
Michael Steidl and Sarah Saunders will be presenting the results of the 2019 Photo Metadata Survey, where we have obtained the views of image creators, publishers and software makers regarding embedded image metadata.
Brendan Quinn, Managing Director of IPTC will be presenting the IPTC Photo Metadata Crawler which looks at usage of embedded photo metadata among news publishers.
We’re looking forward to analysing the world of photo metadata from the perspective of image creators and editors, software makers, publishers, search engines and end users.
There are still some tickets available, so please join us! Attendance is free for CEPIC Congress attendees, but if you just want to come for the IPTC event on Thursday afternoon you can register using this form for €100 + VAT.
See you there!
This post is part of a series about the IPTC Spring Meeting 2019 in Lisbon, Portugal. See the day 1 writeup and the day 2 writeup.
Day 3 of the Lisbon meeting was all about metadata and controlled vocabularies, rights, and a look to the future of IPTC’s work plan.
We started with Jennifer Parrucci, Senior Taxonomist at New York Times and lead of the IPTC NewsCodes Working Group, who gave an update on the group’s activities over the past six months. We have been focussing on updating our core subject taxonomy Media Topics, including updates to term labels and definitions, and also integrating and updating mappings to Wikidata entities that were kindly provided by Thad Guidry from the schema.org community.
Integrating Wikidata mappings was an interesting challenge as we didn’t always have good mappings, for example for “arts, culture, entertainment and media” there is no Wikidata entity that is broad enough to encompass all of those terms. But for the leaves of our tree, most terms had mappings, and for those that didn’t we will be suggesting new terms in Wikidata to accommodate them. We will also look at updating the mappings from Wikidata back to Media Topics now that we have updated the mappings in the other direction. Brendan Quinn presented some new tools used for managing NewsCodes internally, plus a new web tree browser view of Media Topics which will be launched very soon.
Translating Media Topics is another hot issue, with a recent contribution from the Swedish media that is now available as a Swedish language version of Media Topics. We have made it easier to find the language translations in the NewsCodes browser, and have also added some new terms that were suggested by the Swedish media consortium that will be using the new Swedish translation of Media Topics as their categorisation system for sharing content in the future. We realise that nearly ten years after moving from SubjectCodes to Media Topics as the standard IPTC subject classification, we still don’t support as many languages in Media Topics as we do in SubjectCodes, so we want to make it as easy as possible to perform translations. Our discussion was based on the useful idea that anything with an existing translation in SubjectCodes can be directly taken into a Media Topics translation, and we can use the SubjectCode and Wikidata mappings to extract suggested terms to get a translation team started. We have interest in creating Media Topics translations in Portuguese (for both Portugal and Brazil) and Chinese. If you are interested in helping with translations, please let us know.
Johan Lindgren from TT in Sweden spoke about the project that led to the Swedish translations and also discussed how they are approaching handling entities (names and organisations). This led to a wider discussion led by Stuart Myles of how to handle lists of entities and whether IPTC should be working on a standard or a best practice document in that area. We also discussed the idea of a taxonomy for describing images in a stylistic way (such as “happy”, “blue”, or “outdoors”) as opposed to describing the content. Such a standardised controlled vocabulary could be useful to image libraries and AI classification engines. This is an area of active work for us and more information will be available in the coming months. If you want to help, talk to us!
Invited guest Carlos Amaral from local company Priberam demonstrated their text mining and visualisation system created in partnership with Deutsche Welle and other broadcasters for use in browsing stories according to subject, image, extracted entities and keywords.
Stéphane Guérillot from AFP presented his new API for retrieving news content, which led to more discussion of whether IPTC should be standardising an API that could be used by multiple news providers to share their content.
Michael Steidl spoke on RightsML and Blaise Galinier from BBC talked about their current project looking at viewing news content based on rights. Two key insights from Blaise’s talk: Firstly, any demonstration of what is or isn’t usable is always based on the particular user and the context in which they want to use a piece of media. Also, it’s not enough to show a journalist what they can and can’t use; they need to know why a piece of content is “red” “green” or “amber”.
Everyone had a great time at the 2019 Spring Meeting, and we’re already planning the next one in Ljubljana, Slovenia in October. Members: please save the dates 14 – 16 October 2019. If you’re not a member but you would like to present at the meeting, please get in touch!
This post is part of a series about the IPTC Spring Meeting 2019 in Lisbon, Portugal. See the day 1 writeup and the day 3 writeup.
Tuesday was our biggest day in terms of content and also in terms of people! We had 40 people in the meeting room, which was a tight squeeze. Thanks to everyone for your understanding!
The topic focus for Day 2 was Photo and Video, so it was natural that the day was kicked off by Michael Steidl, lead of the IPTC Photo and Video Working Groups. As we had a lot of new members and new attendees in the audience, Michael gave an overview of how IPTC Photo Metadata has come to where it is today, used by almost all photography providers and even used in Google Image Search results (see our post from last year on that subject). The Photo Metadata Working Group is currently conducting a survey of Photo Metadata usage across publishers, photo suppliers (such as stock photo agencies and news wires), and software makers. Michael gave a quick preview of some of the results but we won’t spoil anything here: you will have to wait for the full results to be revealed at the 2019 IPTC Photo Metadata Conference in Paris this June. Brendan Quinn also presented a status report on the IPTC Photo Metadata Crawler, which examines usage of IPTC Photo Metadata fields at news providers around the world. This will also be revealed at the Photo Metadata Conference.
Next, invited visitors Ilkka Järstä and Marina Ekroos from Frameright presented their solution to the problem of cropping images for different outlets, for example all of the different sizes required for various social media. They embed the crop regions using embedded metadata which is of great interest to the Photo Metadata Working Group, as we are looking at various options for allowing region-based metadata to cover not only an image as a whole but a region within an image, in a standardised way.
We had a workshop / discussion session on the recently ratified EU Copyright Directive which will impact all media companies in the next two years. Voted through by the European Parliament this month after intense lobbying from both sides, it could easily be bigger than GDPR, so it’s important for media outlets around the world. Discussion included how and whether IPTC standards could be used to help companies comply with the law. No doubt we will be hearing more about this in the future.
Michael then presented the Video Metadata Working Group‘s status report, including promotional activities at conferences and investigations to see what use cases we can gather from various users of video metadata amongst our members and in the wider media industry.
Then Abdul Hakim from DPP showed a practical use of video metadata in the DPP Metadata for News Exchange initiative, which is based on NewsML-G2. He gave an end-to-end demonstration of metadata being carried through from shot planning through the production process all the way to distribution via Reuters Connect. See our blog post about the Metadata for News Exchange project for more details.
Then Andy Read from BBC presented the BBC’s “Data flow for News” project, taking the principles of metadata being carried through the newsroom along with the content, looking at how to track the cost of production of each item of content and also its “audience value” across platforms to calculate a return on investment figure for all types of content. Iain Smith showed the other side of this project via a live demonstration of the BBC’s newsroom audience measurement system.
After lunch, Gan Lu and Kitty Lan from new IPTC member Yuanben presented their approach to rights protection using blockchain technology. Yuanben run a blockchain-based image registry plus a scanner that detects copyright infringements on the web. Using blockchain as proof of existence has been around for a while but it’s great to see it being used in such a practical context, very relevant for the media industry.
Lastly, another new member Shutterstock was represented by Lúí Smyth who gave us an overview of Shutterstock’s current projects relating to large-scale image management: they have over 260 million images, with over 1 million images added each week! Shutterstock are using the opportunity of refreshing their systems to re-align with IPTC standards and to learn what their suppliers, partners and distributors expect, and we look forward to helping them tackle shared challenges together.
This post is part of a series about the IPTC Spring Meeting 2019 in Lisbon, Portugal. See the day 2 writeup and the day 3 writeup.
Last week brought IPTC members together for our twice-yearly Face-to-Face Meeting to discuss news credibility, taxonomies and controlled vocabularies, updates in sports standards and much more!
This year’s IPTC Spring Meeting was in Lisbon, Portugal, and over 40 IPTC member delegates, member experts and invited guests gathered for three days to discuss all the latest developments in news and media technology.
On Monday, IPTC Chair and Director of Information Management for Associated Press Stuart Myles gave a great introduction and overview of what was to come in the meeting. After everyone introduced themselves, Stuart discussed some changes that the IPTC Board has been thinking about, including looking at updating the Mission and Vision of the organisation to reflect how we operate in 2019.
Then Robert Schmidt-Nia from dpa Deutsche Presse-Agentur introduced their C-POP project (in collaboration with STT and the Sanoma group in Finland) which follows on from the Performing Content we saw at the previous meeting in Toronto. It was interesting hearing about the agency’s shift in focus from a strict business-to-business model to a “B2B2C” model thinking about what consumers needed and how agencies could help publishers to deliver on the needs of readers and subscribers, ideally using feedback from publishers to agencies on how well their content is performing according to real metrics like loyalty and subscription revenue. IPTC will be involved in the C-POP project so you can expect to hear more about this in the future.
On the same topic, Andy Read from BBC gave an overview of the “Telescope” internal measurement tool, showing how BBC staff can view in real time how their content is being consumed by region, topic or device.
James Logan from the BBC and Brendan Quinn of IPTC gave an overview of IPTC’s work with news trust and credibility projects The Trust Project and the Journalism Trust Initiative. We decided at the Autumn 2018 Meeting that IPTC wouldn’t create its own standard around news credibility, disinformation and “fake news”, but that we would work with existing groups and help them to incorporate their standards in IPTC’s work. With The Trust Project, that has been going well, and we are almost ready to publish some best practices on implementing the Trust Project’s Trust Indicators in NewsML-G2 content. Trust Project indicators are already used in schema.org markup by over 120 news providers so it’s great to see such strong uptake.
Separately, we have been working with Reporters Sans Frontières’ Journalism Trust Initiative, which is at an earlier stage and is looking at documenting general standards for trustworthy and ethical journalism. IPTC is part of the JTI’s Technical Task Force, which is working with the drafting teams on making their statements specific enough to be answered with data and indexed by machines. Hopefully it will end up with similar indicators to the Trust Project indicators.
With both news credibility projects, some questions still need to be addressed, such as assessing the credibility of claims (when a news organisation says they are trustworthy, how can you trust them?), and how these trust indicators work in a multi-provider workflow: if a news agency sends some content to a publisher who then merges it with original reportage, who determines the trust indicators that are attached to the final story? There is definitely a lot more work to do!
On the same topic, Dave Compton of Refinitiv gave an update on how the News Architecture Working Group has been looking at the Trust Project’s Trust Indicators and working them into NewsML-G2. As far as we have seen so far, no updates to the NewsML-G2 standard are necessary to support the new work. Martin Vertel from dpa showed us the API he created to give dpa’s clients access to Trust Project indicators for dpa stories. Building it with a browser-based JavaScript module opens up some interesting possibilities.
Joaquim Carreira from local agency Lusa showed us the “Combate Às Fake News” project focussing on media literacy and helping readers to know what to look for, including the idea of a “nutrition label” for news content looking at criteria such as factuality, readability and use of emotional language.
The day was rounded off with Johan Lindgren of Swedish agency TT presenting the recent work of IPTC’s Sports Content Working Group. The group has recently been tidying up the spec and incorporating suggestions for changes, plus looking at eSports and Chess as two non-traditional sports that are both seeing an increase in interest – in the case of eSports, it is becoming a huge industry. Our tests showed that in simple cases eSports results can be addressed with existing SportsML 3 structures, but to handle more detailed play-by-play results we may need to at least introduce a new controlled vocabulary. Please let us know if you would like to implement SportsML for eSports!
Johan also presented the draft of SportsML 3.1 to be voted on by the IPTC Standards Committee.
Stay tuned for an update on Days 2 and 3!
We were proud to be involved at last week’s Metadata Exchange for News interoperability demo organised by DPP (formerly known as the Digital Production Partnership).
DPP’s “Metadata Exchange for News” is an industry initiative aimed at making the news production process easier.
The DPP team looked around for existing standards on which to base their work, and when they found IPTC’s NewsML-G2, they realised that it exactly matched their requirements. NewsML-G2’s generic PlanningItem and NewsItem structure meant that it could easily be used to manage news production workflows with no customisation required.
We were treated to a demo of a full news production workflow in the DPP’s offices at ITV in London on February 6th.
A full news production workflow
As you can see from the diagram, the workflow involves these steps:
- An editor creates a planning record for a news item using Wolftech’s planning system, describing metadata for the planned story
- The system sends the planning item as NewsML-G2 to Sony’s XDCAM Air system which converts it to Sony’s proprietary planning metadata and sends it directly to a camera
- XDCAM Air retrieves the footage from the camera and links it to the planning metadata using the NewsML-G2 IDs; the result is then retrieved from XDCAM Air by some simple custom web services
- The web services send NewsML-G2 NewsItem metadata along with the MP4 video file to Ooyala’s Flex Media Platform via an Amazon Web Services S3 bucket
- Ooyala Flex Media Platform sends the media and metadata to the platforms that require it, in this case the Reuters Connect video browsing and distribution platform.
The NewsML-G2 integrations were built for the demo but the idea is that they will soon become standard features of the products involved. All parties reported that implementing NewsML-G2 was fast and fairly painless!
Thanks to all involved and special thanks to Abdul Hakim of DPP for leading the project and organising the demo day.
Look out for an IPTC Webinar on this topic soon!
Thanks to everyone who attended our first webinar on Thursday, with Brendan Quinn providing an introduction to IPTC, explaining what we do, where we have come from and where we are going.
For those who missed it, you can view it on demand by registering via this site.
Please let us know what you thought! Your feedback is always welcome, and we would particularly like to hear ideas for future webinars.
All feedback can be sent to office@iptc.org.