Web Annotation

An overview of Web Annotation. Source: W3C 2018b.

In traditional print media, it's common practice among readers to underline text, highlight sections or write comments in the margins. Such annotations allow readers to express their opinions. Web Annotation is a standard for creating similar annotations on the web for digital content. It's standardized by the W3C Web Annotation Working Group.

Since the early years of the Web, there have been many proprietary systems to enable online annotations. Web Annotation offers a standardized data model, vocabulary, protocol and embeddings to create, organize and share annotations. Web Annotation allows anyone to express their opinions freely without being censored by content authors or publishers. It reuses concepts and tools from Semantic Web.

Discussion

  • Without Web Annotation, how do people annotate content on the web?

    Web content involves authors, content publishers and readers. To engage readers, content publishers enable comments on their sites. However, publishers can moderate comments and even delete those they don't like. This control means that the system is not truly open from a reader's perspective. In fact, some publishers don't allow comments at all.

    Published comments often appear at the end of the article and therefore remain disconnected from a particular line or paragraph to which the comment applies. There's also no distinction between a valuable comment and frivolous chit-chat. Readers also have no way to archive their comments or take their comments from one platform to another. Readers may be identified only loosely and there's no reputation model to say who's a more valuable commenter. The lack of a reputation model was one reason why project Dispute Finder failed.

    Before Web Annotation, solutions were proprietary and closed systems. They didn't partner with organizations such as W3C to take a standards-driven approach.

  • What are some use cases and current adoptions of Web Annotation?
    Showing the delivery of media and annotations via standard APIs. Source: DLCS 2016.

    Annotations can be used to "provide a trace of use; third party commentary; information sharing; information filtering; semantic labeling of document content; and enhanced search". Annotations can help readers discover new content by subscribing to annotation feeds. They can also share annotations, thereby creating communities of common interests. Publishers can use annotations to add value to their content.

    Many systems are using annotations for online collaboration. Examples include Kami and GoodReader. For greater insight, DocumentCloud promises to turn documents into data.

    For engaging online communities in open discussions, RapGenius started in 2009 to annotate rap lyrics but later expanded to other topics and changed its name to Genius in 2014. Publisher John Wiley & Sons, Inc. and the IIIF Consortium have adopted annotations. Mendeley (acquired by Elsevier) provides a research management system that includes annotations.

    To flag fake news, annotations can help in building an ecosystem for fact checking. Web Annotation can enable "decentralized, trustworthy mechanism for fact checking and public discussion". Beyond facts, Climate Feedback is using annotations for content reasoning and argumentation.

  • What annotation tools or services are currently available?
    A number of annotation tools across time. Source: dwhly 2011, slide 6.

    Since the early 1990s, many systems have come and gone to support annotations on the web. Hypothes.is has shared a list of annotation efforts. Among the more widely used services are Diigo, Mendeley, DocumentCloud, RapGenius, GoodReader and Notable PDF. A curated list of ten annotation tools from April 2017 includes zipBoard, UserSnap, PageProofer, JIRA Capture, BugHerd, Scrible, Hypothesis, Diigo, TrackDuck and Twiddla.

    The Mosaic browser, released in 1993, had support for annotations. Third Voice was an annotation service during 1999-2001. Predating this, there were other open-source annotation apps: CritSuite, JotBot, ComMentor and Xanadu. Angel-funded Fleck existed during 2006-2008. At the 2013 I Annotate Conference, many new services were presented: Domeo, Maphub, Pelagios, Authorea, dotdotdot and Hypothes.is.

  • What's the architecture behind annotations?
    An annotation server responds to HTTP GET/POST requests. Source: Koivunen et al. 2000, fig. 4.1.

    There are two ways to implement annotations:

    • Proxy-based: A proxy server merges web content and annotations. Browsers see the merged content. CritLink and InterNote are examples.
    • Browser-based: Original web content and annotations are merged at the browser end. This may be done via a browser plugin or by JavaScript served from the annotation server. Third Voice is an example of a plugin. JotBot uses a Java applet. Yawas uses internal DOM events to display annotations.

    The proxy-based approach is restrictive since readers must access an IRI different from the original one. The browser-based approach is therefore preferred. As examples, Hypothes.is and Pundit take the latter approach. It's expected that eventually browsers will natively support Web Annotation.

    It's important to note that annotations are not necessarily stored at the publisher's server. They could be stored in a separate annotation server. Any third party can offer an annotation service that collects and organizes annotations. Such a service must be neutral to realize the ideals of an open collaborative web.
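    To make the separation between publisher and annotation server concrete, here is a minimal in-memory sketch in Python. It is not the W3C protocol or any real service: the class `AnnotationStore`, its methods and the `annotations.example.org` and `publisher.example.com` addresses are all hypothetical, and the two methods only loosely mirror the HTTP POST and GET requests an annotation server would handle.

```python
import itertools


class AnnotationStore:
    """Hypothetical in-memory stand-in for a third-party annotation server."""

    def __init__(self, base_iri="http://annotations.example.org/anno/"):
        self.base_iri = base_iri
        self._ids = itertools.count(1)
        self._annotations = {}

    def create(self, annotation):
        """Rough analogue of HTTP POST: store a copy and assign it an IRI."""
        stored = dict(annotation)
        stored["id"] = f"{self.base_iri}{next(self._ids)}"
        self._annotations[stored["id"]] = stored
        return stored

    def find_by_target(self, target_iri):
        """Rough analogue of HTTP GET: all annotations about one page."""
        return [a for a in self._annotations.values() if a["target"] == target_iri]


store = AnnotationStore()
store.create({
    "type": "Annotation",
    "body": "http://example.org/comment1",
    "target": "http://publisher.example.com/article",
})
store.create({
    "type": "Annotation",
    "body": "http://example.org/comment2",
    "target": "http://publisher.example.com/other-article",
})

# The publisher's server is never contacted: the lookup is keyed on the target IRI.
matches = store.find_by_target("http://publisher.example.com/article")
```

    A browser plugin or script would query such a store with the IRI of the page being read and overlay whatever comes back.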

  • Could you briefly explain the Web Annotation Data Model?
    An annotation is a set of connected resources. Source: Sanderson et al. 2017, sec. 1.

    An annotation is a rooted directed graph that establishes relationships among resources. A resource can be either a Body or a Target. An annotation can have zero or more bodies and one or more targets. The content of the body is "about" the target. An annotation, body or target may have its own properties and relationships, such as creation or descriptive information.

    Since resources are distributed on the web, they are identified using IRIs. In some cases, the Body is just textual content and may be included as part of the annotation. In such cases, a separate IRI is not required for the Body.
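    As a rough illustration of these rules, the Python sketch below assembles an annotation as a plain dictionary that can be serialized to JSON. The helper names `make_annotation` and `textual_body` and the example IRIs are assumptions made for this sketch; only the property names (`@context`, `id`, `type`, `body`, `target`, `TextualBody`, `value`, `format`) come from the data model.

```python
import json

ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def textual_body(text):
    """A Body embedded in the annotation itself, so it needs no IRI of its own."""
    return {"type": "TextualBody", "value": text, "format": "text/plain"}


def make_annotation(anno_id, targets, bodies=()):
    """Assemble an annotation with zero or more bodies and one or more targets."""
    if not targets:
        raise ValueError("an annotation needs at least one target")
    annotation = {
        "@context": ANNO_CONTEXT,
        "id": anno_id,
        "type": "Annotation",
        "target": list(targets),
    }
    if bodies:  # the body is optional, so omit the key when there is none
        annotation["body"] = list(bodies)
    return annotation


anno = make_annotation(
    "http://example.org/anno-demo",
    targets=["http://example.com/page1"],
    bodies=[textual_body("Worth a second read."), "http://example.org/analysis1.mp3"],
)
print(json.dumps(anno, indent=2))
```

    The first body is embedded text, while the second is referenced only by its IRI, which mirrors the two cases described above.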

    We often want to select part of a resource and not the entire content at the IRI. This is called Segment (of Interest). A Selector is used to extract the segment from a resource. For example, selectors are available to select some region of an image; an exact quote; content that matches a CSS or XPath rule; some text by its start and end positions; and so on.
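    The following Python sketch shows how two of these selectors might be resolved against plain text. It is a simplification under stated assumptions: the function names are hypothetical, the page is treated as one already-normalized string, and the quote match is exact rather than fuzzy as a real tool would likely need.

```python
def select_by_position(text, selector):
    """TextPositionSelector: slice from start (inclusive) to end (exclusive)."""
    return text[selector["start"]:selector["end"]]


def select_by_quote(text, selector):
    """TextQuoteSelector: locate `exact`, using prefix and suffix to disambiguate.

    Returns the (start, end) offsets of the quote, or None if it is not found.
    """
    prefix = selector.get("prefix", "")
    needle = prefix + selector["exact"] + selector.get("suffix", "")
    found = text.find(needle)
    if found == -1:
        return None
    start = found + len(prefix)
    return start, start + len(selector["exact"])


page = "The cat sat. The cat ran. The dog slept."
position = {"type": "TextPositionSelector", "start": 17, "end": 20}
quote = {
    "type": "TextQuoteSelector",
    "exact": "cat",
    "prefix": "sat. The ",
    "suffix": " ran",
}

# Both selectors pick out the second "cat" on the original page.
by_position = select_by_position(page, position)
by_quote = select_by_quote(page, quote)

# After the page is edited, fixed offsets drift but the quote still anchors.
edited = "Note: " + page
drifted = select_by_position(edited, position)
re_anchored = select_by_quote(edited, quote)
```

    This is one reason an annotation may carry several selectors for the same segment: if one stops matching after the content changes, another may still anchor.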

  • What are the W3C Recommendations for Web Annotation?

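    We have the following Recommendations and Working Group Notes, all published in February 2017:

    • Web Annotation Data Model (Recommendation): The structure of an annotation and its JSON-LD serialization.
    • Web Annotation Vocabulary (Recommendation): The RDF classes and properties used by the data model.
    • Web Annotation Protocol (Recommendation): HTTP-based interactions for creating and managing annotations on a server.
    • Embedding Web Annotations in HTML (Working Group Note): Ways to include annotations within HTML documents.
    • Selectors and States (Working Group Note): A reference for selecting part of a resource or a particular state of it.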

  • What are the tools available to implement Web Annotation?
    Screenshot showing Hypothesis-enabled annotations appearing as an expandable side bar. Source: Udell 2018a.

    Annotator is an open-source JavaScript library that can be added to any website and thus enable annotations on it. Annotator can also be extended with plugins. Annotorious is one such plugin for image annotation. Annotator is being used in many projects: Hypothes.is, Harvard's Open Video Annotation Project, EdX, MIT's Annotation Studio, WritingPod, Crunched Book and many more.

    Hypothesis is based on Annotator and is also open source. While Annotator used to store annotations at annotateit.org, this service closed in March 2017. Hypothesis is also a service that stores annotations but you can deploy your own servers instead.

    Pundit Annotator is based on AngularJS. Its client code is open source. You can also download and install the server code.

  • What are the elements of an interoperable annotations layer?
    Annotations can be delivered interactively or embedded directly as footnotes. Source: Udell 2017.

    Annotations may be considered as an extra layer of information on top of web content. To accelerate the development of a pervasive interoperable annotation layer, the Annotating All Knowledge Coalition identified what such a layer should contain.

    An interoperable annotations layer should be standardized. It should be an open framework. Annotations should be as granular as possible: parts of an image, specific lines of text, etc. It should support discovery and linking of annotations across different content formats (HTML or PDF). Likewise, content may have different versions and annotations should persist across versions.

    Annotations can be private or public. Identities should be managed. To allow discovery and sharing of annotations, standard identifiers should be used. Notifications should be possible, such as notifying publishers when their content is annotated.

    Readers should be able to selectively choose which annotations they wish to see when reading content. Annotations should be portable across services. It should be possible to link to other resources or programmatically annotate. Further tools can enable tagging and filtering of annotations.
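    As a small illustration of selective viewing, the Python sketch below filters a list of annotations by tag and by creator. The function names and the sample creators are hypothetical; the only assumption taken from the data model is that tags are carried as bodies with `"purpose": "tagging"`.

```python
def tags_of(annotation):
    """Collect the values of all bodies whose purpose is 'tagging'."""
    bodies = annotation.get("body", [])
    if isinstance(bodies, (dict, str)):
        bodies = [bodies]  # a single body may appear without a surrounding list
    return {
        body["value"]
        for body in bodies
        if isinstance(body, dict)
        and body.get("purpose") == "tagging"
        and "value" in body
    }


def filter_annotations(annotations, tag=None, creator=None):
    """Keep only annotations that match every criterion that was supplied."""
    result = []
    for annotation in annotations:
        if tag is not None and tag not in tags_of(annotation):
            continue
        if creator is not None and annotation.get("creator") != creator:
            continue
        result.append(annotation)
    return result


annotations = [
    {
        "id": "http://example.org/anno-a",
        "creator": "acct:alice@example.org",
        "body": [{"type": "TextualBody", "purpose": "tagging", "value": "fact-check"}],
    },
    {
        "id": "http://example.org/anno-b",
        "creator": "acct:bob@example.org",
        "body": {"type": "TextualBody", "value": "Nice article!"},
    },
    {
        "id": "http://example.org/anno-c",
        "creator": "acct:bob@example.org",
        "body": [{"type": "TextualBody", "purpose": "tagging", "value": "fact-check"}],
    },
]

fact_checks = filter_annotations(annotations, tag="fact-check")
bobs_fact_checks = filter_annotations(annotations, tag="fact-check", creator="acct:bob@example.org")
```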

  • What are the challenges ahead for Web Annotation?

    Interoperability is a challenge. Annotations created with one tool should be compatible with another: can we export annotations from one tool and import them into another? When a browser has multiple tools installed, these tools should coexist when creating, editing and anchoring annotations. One study annotated a webpage with both Hypothesis and Pundit and found that the tools are incompatible and somewhat conflicting.

Milestones

1989

Tim Berners-Lee publishes his ideas about the World Wide Web and hypertext. This paper includes thoughts about annotations, some of which could be kept private. Thus, the idea of annotating web content is not something that came later.

1993

The first widely distributed web browser Mosaic is released by Marc Andreessen and Eric Bina. Mosaic v1.1 includes client-side support for annotations. However, this is considered as a "nice-to-have" feature and is not developed further.

1999

Third Voice is launched as a browser plugin. It allows users to annotate any website and thereby have open discussions without being controlled by content publishers. However, the service gains a bad reputation for enabling "web graffiti". Getting users to download and install the plugin also becomes a barrier. The service closes in April 2001.

Jul 2001

Where Third Voice fails, W3C attempts with Annotea that's built into W3C's Amaya browser. W3C proposes to adopt and reuse concepts and tools from Semantic Web, such as RDF and XML.

2009

Two groups, Annotation Ontology and the Open Annotation Collaboration, independently start work on annotation specifications. In 2011, they join together to form the W3C Open Annotation Community Group.

Jul 2011

Version 1.0.0 of Annotator is released. This is a JavaScript library for building annotation applications in browsers. Version 1.2.10 is released in February 2015.

Oct 2011

As an open source non-profit platform, Hypothes.is is announced. It aims to be community moderated, neutral and transparent.

Feb 2013

Open Annotation Community Group at W3C publishes a Community Draft titled Open Annotation Data Model. This specifies an RDF-based data model for exchanging annotations between applications. This is not a standard but forms the starting point for future standards.

Apr 2013

The first I Annotate Conference, today an annual event, is held in San Francisco. Dan Whaley of Hypothes.is introduces the theme:

"Annotation as the 3D printing of the web: the promise of decentralization of knowledge creation online."

Oct 2014

Built on Annotator, Hypothesis application is launched as a Chrome extension. An alpha version was launched earlier in 2013. In April 2018, they achieve a milestone of 3 million annotations.

Feb 2017

W3C Web Annotation Working Group publishes three Recommendations that define Web Annotation. In addition, two Working Group Notes are also published. This Working Group was formed in 2014.

Sample Code

  • // Source: https://www.w3.org/TR/annotation-model/
    // This JSON-LD example says that a PDF document is annotated
    // with an audio MP3 file.
    {
      "@context": "http://www.w3.org/ns/anno.jsonld",
      "id": "http://example.org/anno2",
      "type": "Annotation",
      "body": {
        "id": "http://example.org/analysis1.mp3",
        "format": "audio/mpeg",
        "language": "fr"
      },
      "target": {
        "id": "http://example.gov/patent1.pdf",
        "format": "application/pdf",
        "language": ["en", "ar"],
        "textDirection": "ltr",
        "processingLanguage": "en"
      }
    }
     
    // Select a rectangular region of an image
    // #xywh=100,100,300,300 is the fragment selector
    {
      "@context": "http://www.w3.org/ns/anno.jsonld",
      "id": "http://example.org/anno4",
      "type": "Annotation",
      "body": "http://example.org/description1",
      "target": {
        "id": "http://example.com/image1#xywh=100,100,300,300",
        "type": "Image",
        "format": "image/jpeg"
      }
    }
     
    // Source: https://www.infoworld.com/article/3263344/web-development/ \
    //         how-web-annotation-will-transform-content-management.html
    // Note: We have broken long strings into multiple lines for readability
    //       but in fact this is not valid syntax in JSON!
    // Example shows the use of selectors XPathSelector and TextPositionSelector
    // Body itself is empty except for tagging it as EnterpriseAnnotation
    // Body is also not a separate IRI: it's embedded into the annotation itself
    {
      "body": [{
        "type": "TextualBody",
        "value": "",
        "format": "text/markdown"
      }, {
        "type": "TextualBody",
        "purpose": "tagging",
        "value": "EnterpriseAnnotation"
      }],
      "target": [{
        "source": "https://www.aihw.gov.au/reports/cancer/\
        cancer-compendium-information-and-trends-by-cancer-type\
        /report-contents/pancreatic-cancer-in-australia",
        "selector": [{
          "type": "XPathSelector",
          "value": "/form[1]/div[4]/main[1]/div[3]/div[2]/div[1]/div[1]/div[1]/\
          div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/p[3]",
          "refinedBy": {
            "start": 0,
            "end": 243,
            "type": "TextPositionSelector"
          }
        }, {
          "type": "TextPositionSelector",
          "end": 15827,
          "start": 15584
        }, {
          "exact": "In 2013, there were 2,865 new cases of pancreatic cancer diagnosed in Australia \
          (1,490 males and 1,374 females). In 2017, it is estimated that 3,271 new cases of pancreatic \
          cancer will be diagnosed in Australia (1,722 males and 1,548 females).",
          "prefix": "only diagnosed cancer in 2017.\n\n",
          "type": "TextQuoteSelector",
          "suffix": "\n\nIn 2013, the age-standardised "
        }]
      }],
      "created": "2017-11-28T18:56:04.889815+00:00",
      "@context": "http://www.w3.org/ns/anno.jsonld",
      "creator": "acct:judell@hypothes.is",
      "type": "Annotation",
      "id": "https://hypothes.is/a/yW9M1tRtEee1Q-typwRX4w",
      "modified": "2017-11-28T18:56:04.889815+00:00"
    }
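
    The fragment selector in the second example above, `#xywh=100,100,300,300`, can be unpacked with a few lines of Python. This is only a sketch: the function name `parse_xywh` is hypothetical, and it handles just the plain integer pixel form shown in that example, not other media fragment variants.

```python
from urllib.parse import urldefrag


def parse_xywh(target_iri):
    """Split a target IRI into its source IRI and a rectangular region, if any."""
    source, fragment = urldefrag(target_iri)
    if not fragment.startswith("xywh="):
        return source, None  # no region given: the whole resource is the target
    x, y, width, height = (int(n) for n in fragment[len("xywh="):].split(","))
    return source, {"x": x, "y": y, "width": width, "height": height}


source, region = parse_xywh("http://example.com/image1#xywh=100,100,300,300")
```

    Here `source` is the image IRI without the fragment and `region` holds the top-left corner together with the width and height of the selected rectangle.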
     

References

  1. Angell, Nate. 2018. "Meet the Hypothesis Team That Made 3 Million Annotations Possible." Hypothes.is Blog, April 30. Accessed 2018-06-22.
  2. Annotator. 2018a. "Home - Annotator - Annotating the Web." Accessed 2018-06-22.
  3. Annotator. 2018b. "Who's Using It?" Accessed 2018-06-22.
  4. C2 Wiki. 2013. "Web Annotation." August 31. Accessed 2018-06-22.
  5. Carpenter, Todd A. 2013. "iAnnotate — Whatever Happened to the Web as an Annotation System?" The Scholarly Kitchen, April 30. Accessed 2018-06-22.
  6. DLCS. 2016. "DLCS Delivery Architecture." Introduction to the DLCS, Digital Library Cloud Services, December 07. Updated 2017-03-13. Accessed 2018-06-25.
  7. Farley, Tim. 2011. "Hypothes.is could become a crucial tool for skeptics." Skeptical Software Tools, October 21. Accessed 2018-06-22.
  8. Genius Founders. 2014. "Introducing Genius.com." July 12. Accessed 2018-06-22.
  9. Herman, Ivan, Robert Sanderson, Paolo Ciccarese, and Benjamin Young, eds. 2017. "Selectors and States." W3C Reference Note, February 23. Accessed 2018-06-22.
  10. Hypothes.is. 2018a. "Historical Survey of Annotation Efforts." Accessed 2018-06-22.
  11. Hypothes.is. 2018b. "Our Principles." Accessed 2018-06-22.
  12. Hypothes.is. 2018c. "What is the difference between Hypothesis and AnnotatorJS?" Accessed 2018-06-22.
  13. Koivunen, Marja-Riitta, Dan Brickley, José Kahan, Eric Prud'Hommeaux, and Ralph R. Swick. 2000. "The W3C Collaborative Web Annotation Project ... or how to have fun while building an RDF infrastructure." W3C, May 12. Accessed 2018-06-22.
  14. Kumar, Aparna. 2001. "Third Voice Trails Off..." Wired, April 04. Accessed 2018-06-22.
  15. Lumpkin, Matt. 2013. "I annotate conference." Blog, April 10. Updated 2013-04-14. Accessed 2018-06-22.
  16. Martone, Maryann. 2016. "Attributes of an Interoperable Annotation." FORCE11, May 9. Accessed 2018-06-22.
  17. Martone, Maryann. 2017. "Annotating all Knowledge: Adventures in Interoperability." February 09. Accessed 2018-06-22.
  18. McDonald, Glenn. 2001. "A Standard for e-Comments." MIT Technology Review, July 16. Accessed 2018-06-22.
  19. Net7 GitHub. 2015. "net7/pundit2." October 29. Accessed 2018-06-22.
  20. Open Annotation GitHub. 2015. "annotator tags." July 3. Accessed 2018-06-22.
  21. Sanderson, Robert, and Timothy Cole. 2017. "Making it easier to share annotations on the web." W3C GitHub Blog, February 23. Accessed 2018-06-22.
  22. Sanderson, Robert, Paolo Ciccarese, and Herbert Van de Sompel. 2013. "Open Annotation Data Model." W3C Community Draft, February 08. Accessed 2018-06-22.
  23. Sanderson, Robert, Paolo Ciccarese, and Benjamin Young, eds. 2017. "Web Annotation Data Model." W3C Recommendation, February 23. Accessed 2018-06-22.
  24. Saxena, Rishabh. 2017. "Top 10 Website Annotation Tools." Mopinion, April 18. Accessed 2018-06-22.
  25. Udell, Jon. 2017. "How annotation layers define 'segments of interest' for new kinds of applications." Blog, February 08. Accessed 2018-06-22.
  26. Udell, Jon. 2018a. "How web annotation can help readers spot fact-checked claims." MisinfoCon, May 18. Accessed 2018-06-22.
  27. W3C. 2018a. "Web Annotation Working Group." Accessed 2018-06-22.
  28. W3C. 2018b. "Web Annotation Architecture." SVG document. Accessed 2018-06-22.
  29. Web Annotation WG. 2018. "Documents." Accessed 2018-06-22.
  30. Willyard, Cassandra. 2018. "At Climate Feedback, scientists encourage better science reporting. But who is listening?" Columbia Journalism Review, February 01. Accessed 2018-06-22.
  31. dwhly. 2011. "Hypothesis quick overview 2011-10-19." SlideShare, July 10. Accessed 2018-06-22.
  32. dwhly. 2014. "Launch!" Hypothes.is Blog, October 27. Accessed 2018-06-22.
  33. dwhly. 2017. "Annotation Is Now a Web Standard." Hypothes.is Blog, February 24. Accessed 2018-06-22.
  34. pbrantley. 2013. "I Annotate 2013: Our Take." Hypothes.is Blog, April 18. Accessed 2018-06-22.

Further Reading

  1. Sanderson, Robert, Paolo Ciccarese, and Benjamin Young, eds. 2017. "Web Annotation Data Model." W3C Recommendation, February 23. Accessed 2018-06-22.
  2. Udell, Jon. 2018b. "How web annotation will transform content management." InfoWorld, March 21. Accessed 2018-06-22.
  3. Perton, Marc. 2016. "With Web Annotation, You Can Comment on Any Page—But Should You?" Newsweek, April 30. Accessed 2018-06-22.

Cite As

Devopedia. 2022. "Web Annotation." Version 3, February 15. Accessed 2024-06-25. https://devopedia.org/web-annotation
Contributed by
3 authors


Last updated on
2022-02-15 11:48:10
  • Resource Description Framework
  • Hypothesis (Annotator)
  • Annotator (JavaScript)
  • Semantic Web