Learning Center

Annotating Texts

What is annotation?

Annotation can be:

  • A systematic summary of the text that you create within the document
  • A key tool for close reading that helps you uncover patterns, notice important words, and identify main points
  • An active learning strategy that improves comprehension and retention of information

Why annotate?

  • Isolate and organize important material
  • Identify key concepts
  • Monitor your learning as you read
  • Make exam prep effective and streamlined
  • Can be more efficient than creating a separate set of reading notes

How do you annotate?

Summarize key points in your own words

  • Use headers and words in bold to guide you
  • Look for main ideas, arguments, and points of evidence
  • Notice how the text organizes itself. Chronological order? Idea trees? Etc.

Circle key concepts and phrases

  • What words would it be helpful to look up at the end?
  • What terms show up in lecture? When are different words used for similar concepts? Why?

Write brief comments and questions in the margins

  • Be as specific or broad as you would like—use these questions to activate your thinking about the content
  • See our handout on reading comprehension tips for some examples

Use abbreviations and symbols

  • Try ? when you have a question or something you need to explore further
  • Try ! when something is interesting, a connection, or otherwise worthy of note
  • Try * for anything that you might use as an example or evidence when you use this information
  • Ask yourself what other system of symbols would make sense to you.

Highlight/underline

  • Highlight or underline, but mindfully. Check out our resource on strategic highlighting for tips on when and how to highlight.

Use comment and highlight features built into PDFs, online/digital textbooks, or other apps and browser add-ons

  • Are you using a PDF? Explore its highlight, edit, and comment functions to support your annotations
  • Some browsers have add-ons or extensions that allow you to annotate web pages or web-based documents
  • Does your digital or online textbook come with an annotation feature?
  • Can your digital text be imported into a note-taking tool like OneNote, Evernote, or Google Keep? If so, you might be able to annotate texts in those apps

What are the most important takeaways?

  • Annotation is about increasing your engagement with a text
  • Increased engagement, where you think about and process the material then expand on your learning, is how you achieve mastery in a subject
  • As you annotate a text, ask yourself: how would I explain this to a friend?
  • Put things in your own words and draw connections to what you know and wonder

The table below demonstrates this process using a geography textbook excerpt (Press 2004):

[Chart: a passage from the text in the left column, followed by columns illustrating annotations with too much writing, not enough writing, and a good balance of writing.]

A common concern about annotating texts: It takes time!

Yes, it can, but that time isn’t lost—it’s invested.

Spending the time to annotate on the front end does two important things:

  • It saves you time later when you’re studying. Your annotated notes will help speed up exam prep, because you can review critical concepts quickly and efficiently.
  • It increases the likelihood that you will retain the information after the course is completed. This is especially important when you are supplying the building blocks of your mind and future career.

One last tip: Try separating the reading and annotating processes! Quickly read through a section of the text first, then go back and annotate.

Works consulted:

Nist, S., & Holschuh, J. (2000). Active learning: strategies for college success. Boston: Allyn and Bacon. 202-218.

Simpson, M., & Nist, S. (1990). Textbook annotation: An effective and efficient study strategy for college students. Journal of Reading, 34: 122-129.

Press, F. (2004). Understanding earth (4th ed). New York: W.H. Freeman. 208-210.


How to Annotate Texts

This guide covers the following sections:

  • Annotation Fundamentals
  • How to Start Annotating
  • How to Annotate Digital Texts
  • How to Annotate a Textbook
  • How to Annotate a Scholarly Article or Book
  • How to Annotate Literature
  • How to Annotate Images, Videos, and Performances
  • Additional Resources for Teachers

Writing in your books can make you smarter. Or, at least (according to education experts), annotation–an umbrella term for underlining, highlighting, circling, and, most importantly, leaving comments in the margins–helps students to remember and comprehend what they read. Annotation is like a conversation between reader and text. Proper annotation allows students to record their own opinions and reactions, which can serve as the inspiration for research questions and theses. So, whether you're reading a novel, poem, news article, or science textbook, taking notes along the way can give you an advantage in preparing for tests or writing essays. This guide contains resources that explain the benefits of annotating texts, provide annotation tools, and suggest approaches for diverse kinds of texts; the last section includes lesson plans and exercises for teachers.

Why annotate? As the resources below explain, annotation allows students to emphasize connections to material covered elsewhere in the text (or in other texts), material covered previously in the course, or material covered in lectures and discussion. In other words, proper annotation is an organizing tool and a time saver. The links in this section will introduce you to the theory, practice, and purpose of annotation. 

How to Mark a Book, by Mortimer Adler

This famous, charming essay lays out the case for marking up books, and provides practical suggestions at the end including underlining, highlighting, circling key words, using vertical lines to mark shifts in tone/subject, numbering points in an argument, and keeping track of questions that occur to you as you read. 

How Annotation Reshapes Student Thinking (TeacherHUB)

In this article, a high school teacher discusses the importance of annotation and how annotation encourages more effective critical thinking.

The Future of Annotation (Journal of Business and Technical Communication)

This scholarly article summarizes research on the benefits of annotation in the classroom and in business. It also discusses how technology and digital texts might affect the future of annotation. 

Annotating to Deepen Understanding (Texas Education Agency)

This website provides another introduction to annotation (designed for 11th graders). It includes a helpful section that teaches students how to annotate reading comprehension passages on tests.

Once you understand what annotation is, you're ready to begin. But what tools do you need? How do you prepare? The resources linked in this section list strategies and techniques you can use to start annotating. 

What is Annotating? (Charleston County School District)

This resource gives an overview of annotation styles, including useful shorthands and symbols. This is a good place for a student who has never annotated before to begin.

How to Annotate Text While Reading (YouTube)

This video tutorial (appropriate for grades 6–10) explains the basic ins and outs of annotation and gives examples of the type of information students should be looking for.

Annotation Practices: Reading a Play-text vs. Watching Film (U Calgary)

This blog post, written by a student, talks about how the goals and approaches of annotation might change depending on the type of text or performance being observed. 

Annotating Texts with Sticky Notes (Lyndhurst Schools)

Sometimes students are asked to annotate books they don't own or can't write in for other reasons. This resource provides some strategies for using sticky notes instead.

Teaching Students to Close Read...When You Can't Mark the Text (Performing in Education)

Here, a sixth grade teacher demonstrates the strategies she uses for getting her students to annotate with sticky notes. This resource includes a link to the teacher's free Annotation Bookmark (via Teachers Pay Teachers).

Digital texts can present a special challenge when it comes to annotation; emerging research suggests that many students struggle to critically read and retain information from digital texts. However, proper annotation can help address the problem. This section contains links to the most widely used platforms for electronic annotation.

Evernote is one of the two big players in the "digital annotation apps" game. In addition to allowing users to annotate digital documents, the service (for a fee) allows users to group multiple formats (PDF, webpages, scanned hand-written notes) into separate notebooks, create voice recordings, and sync across all sorts of devices. 

OneNote is Evernote's main competitor. Reviews suggest that OneNote allows for more freedom for digital note-taking than Evernote, but that it is slightly more awkward to import and annotate a PDF, especially on certain platforms. However, OneNote's free version is slightly more feature-filled, and OneNote allows you to link your notes to time stamps on an audio recording.

Diigo is a basic browser extension that allows a user to annotate webpages. Diigo also offers a Screenshot app that allows for direct saving to Google Drive.

While the creators of Hypothesis like to focus on their app's social dimension, students are more likely to be interested in the private highlighting and annotating functions of this program.

Foxit PDF Reader

Foxit is one of the leading PDF readers. Though the full suite must be purchased, Foxit offers a number of annotation and highlighting tools for free.

Nitro PDF Reader

This is another well-reviewed PDF reader. Annotation, highlighting, text editing, and other tools are included in the free version.

GoodReader is a very popular Mac-only app that includes annotation and editing tools for PDFs, Word documents, PowerPoint, and other formats.

Although textbooks have vocabulary lists, summaries, and other features to emphasize important material, annotation can allow students to process information and discover their own connections. This section links to guides and video tutorials that introduce you to textbook annotation. 

Annotating Textbooks (Niagara University)

This PDF provides a basic introduction as well as strategies including focusing on main ideas, working by section or chapter, annotating in your own words, and turning section headings into questions.

A Simple Guide to Text Annotation (Catawba College)

The simple, practical strategies laid out in this step-by-step guide will help students learn how to break down chapters in their textbooks using main ideas, definitions, lists, summaries, and potential test questions.

Annotating (Mercer Community College)

This packet, an excerpt from a literature textbook, provides a short exercise and some examples of how to do textbook annotation, including using shorthand and symbols.

Reading Your Healthcare Textbook: Annotation (Saddleback College)

This PowerPoint contains a number of helpful suggestions, especially for students who are new to annotation. It emphasizes limited highlighting, lots of student writing, and using key words to find the most important information in a textbook. Despite the title, it is useful to a student in any discipline.

Annotating a Textbook (Excelsior College OWL)

This video (with included transcript) discusses how to use textbook features like boxes and sidebars to help guide annotation. It's an extremely helpful, detailed discussion of how textbooks are organized.

Because scholarly articles and books have complex arguments and often depend on technical vocabulary, they present particular challenges for an annotating student. The resources in this section help students get to the heart of scholarly texts in order to annotate and, by extension, understand the reading.

Annotating a Text (Hunter College)

This resource is designed for college students and shows how to annotate a scholarly article using highlighting, paraphrase, a descriptive outline, and a two-margin approach. It ends with a sample passage marked up using the strategies provided. 

Guide to Annotating the Scholarly Article (ReadWriteThink.org)

This is an effective introduction to annotating scholarly articles across all disciplines. This resource encourages students to break down how the article uses primary and secondary sources and to annotate the types of arguments and persuasive strategies (synthesis, analysis, compare/contrast).

How to Highlight and Annotate Your Research Articles (CHHS Media Center)

This video, developed by a high school media specialist, provides an effective beginner-level introduction to annotating research articles. 

How to Read a Scholarly Book (AndrewJacobs.org)

In this essay, a college professor lets readers in on the secrets of scholarly monographs. Though he does not discuss annotation, he explains how to find a scholarly book's thesis, methodology, and often even a brief literature review in the introduction. This is a key place for students to focus when creating annotations. 

A 5-step Approach to Reading Scholarly Literature and Taking Notes (Heather Young Leslie)

This resource, written by a professor of anthropology, is an even more comprehensive and detailed guide to reading scholarly literature. Combining the annotation techniques above with the reading strategy here allows students to process scholarly books efficiently.

Annotation is also an important part of close reading works of literature. Annotating helps students recognize symbolism, double meanings, and other literary devices. These resources provide additional guidelines on annotating literature.

AP English Language Annotation Guide (YouTube)

In this ~10 minute video, an AP Language teacher provides tips and suggestions for using annotations to point out rhetorical strategies and other important information.

Annotating Text Lesson (YouTube)

In this video tutorial, an English teacher shows how she uses the white board to guide students through annotation and close reading. This resource uses an in-depth example to model annotation step-by-step.

Close Reading a Text and Avoiding Pitfalls (Purdue OWL)

This resource demonstrates how annotation is a central part of a solid close reading strategy; it also lists common mistakes to avoid in the annotation process.

AP Literature Assignment: Annotating Literature (Mount Notre Dame H.S.)

This brief assignment sheet contains suggestions for what to annotate in a novel, including building connections between parts of the book, among multiple books you are reading/have read, and between the book and your own experience. It also includes samples of quality annotations.

AP Handout: Annotation Guide (Covington Catholic H.S.)

This annotation guide shows how to keep track of symbolism, figurative language, and other devices in a novel using a highlighter, a pencil, and every part of a book (including the front and back covers).

In addition to written resources, it's possible to annotate visual "texts" like theatrical performances, movies, sculptures, and paintings. Taking notes on visual texts allows students to recall details after viewing a resource which, unlike a book, can't be re-read or re-visited (for example, a play that has finished its run, or an art exhibition that is far away). These resources draw attention to the special questions and techniques that students should use when dealing with visual texts.

How to Take Notes on Videos (U of Southern California)

This resource is a good place to start for a student who has never had to take notes on film before. It briefly outlines three general approaches to note-taking on a film. 

How to Analyze a Movie, Step-by-Step (San Diego Film Festival)

This detailed guide provides lots of tips for film criticism and analysis. It contains a list of specific questions to ask with respect to plot, character development, direction, musical score, cinematography, special effects, and more. 

How to "Read" a Film (UPenn)

This resource provides an academic perspective on the art of annotating and analyzing a film. Like other resources, it provides students a checklist of things to watch out for as they watch the film.

Art Annotation Guide (Gosford Hill School)

This resource focuses on how to annotate a piece of art with respect to its formal elements like line, tone, mood, and composition. It contains a number of helpful questions and relevant examples. 

Photography Annotation (Arts at Trinity)

This resource is designed specifically for photography students. Like some of the other resources on this list, it primarily focuses on formal elements, but also shows students how to integrate the specific technical vocabulary of modern photography. This resource also contains a number of helpful sample annotations.

How to Review a Play (U of Wisconsin)

This resource from the University of Wisconsin Writing Center is designed to help students write a review of a play. It contains suggested questions for students to keep in mind as they watch a given production. This resource helps students think about staging, props, script alterations, and many other key elements of a performance.

This section contains links to lesson plans and exercises suitable for high school and college instructors.

Beyond the Yellow Highlighter: Teaching Annotation Skills to Improve Reading Comprehension (English Journal)

In this journal article, a high school teacher talks about her approach to teaching annotation. This article makes a clear distinction between annotation and mere highlighting.

Lesson Plan for Teaching Annotation, Grades 9–12 (readwritethink.org)

This lesson plan, published by the National Council of Teachers of English, contains four complete lessons that help introduce high school students to annotation.

Teaching Theme Using Close Reading (Performing in Education)

This lesson plan was developed by a middle school teacher, and is aligned to Common Core. The teacher presents her strategies and resources in comprehensive fashion.

Analyzing a Speech Using Annotation (UNC-TV/PBS Learning Media)

This complete lesson plan, which includes a guide for the teacher and relevant handouts for students, will prepare students to analyze both the written and presentation components of a speech. This lesson plan is best for students in 6th–10th grade.

Writing to Learn History: Annotation and Mini-Writes (teachinghistory.org)

This teaching guide, developed for high school History classes, provides handouts and suggested exercises that can help students become more comfortable with annotating historical sources.

Writing About Art (The College Board)

This Prezi presentation is useful to any teacher introducing students to the basics of annotating art. The presentation covers annotating for both formal elements and historical/cultural significance.

Film Study Worksheets (TeachWithMovies.org)

This resource contains links to a general film study worksheet, as well as specific worksheets for novel adaptations, historical films, documentaries, and more. These resources are appropriate for advanced middle school students and some high school students. 

Annotation Practice Worksheet (La Guardia Community College)

This worksheet has a sample text and instructions for students to annotate it. It is a useful resource for teachers who want to give their students a chance to practice, but don't have the time to select an appropriate piece of text. 


7 Strategies for Teaching Students How to Annotate

  • November 7, 2018

For many educators, annotation goes hand in hand with developing close reading skills. Annotation engages students more fully and strengthens reading comprehension, helping students develop a deeper understanding and appreciation of literature.

However, it’s also one of the more difficult skills to teach. In order to think critically about a text, students need to learn how to actively engage with the text they’re reading. Annotation provides that immersive experience, and new digital reading technologies not only make annotation easier than ever, but also make it possible for any book, article, or text to be annotated.

Below are seven strategies to help your students master the basics of annotation and become more engaged, closer readers.

1. Teach the Basics of Good Annotation

Help your students understand that annotation is simply the process of thoughtful reading and making notes as they study a text. Start with some basic forms of annotation:

  • highlighting a phrase or sentence and including a comment
  • circling a word that needs defining
  • posing a question when something isn’t fully understood
  • writing a short summary of a key section

Assure them that good annotating will help them concentrate, better understand what they read, and better remember their thoughts and ideas when they revisit the text.

2. Model Effective Annotation

One of the most effective ways to teach annotation is to show students your own thought process when annotating a text. Display a sample text and think out loud as you make notes. Show students how you might underline key words or sentences and write comments or questions, and explain what you’re thinking as you go through the reading and annotation process.

Annotation Activity: Project a short, simple text and let students come up and write their own comments and discuss what they’ve written and why. This type of modeling and interaction helps students understand the thought process that critical reading requires.

3. Give Your Students a Reading Checklist

When first teaching students about annotation, you can help shape their critical analysis and active reading strategies by giving them specific things to look for while reading, like a checklist or annotation worksheet for a text. You might have them explain how headings and subheads connect with the text, or have them identify facts that add to their understanding.

4. Provide an Annotation Rubric

When you know what your annotation goals are for your students, it can be useful to develop a simple rubric that defines what high-quality and thoughtful annotation looks like. This provides guidance for your students and makes grading easier for you. You can modify your rubric as goals and students’ needs change over time.

5. Keep It Simple

Especially for younger or struggling readers, help your students develop self-confidence by keeping things simple. Ask them to circle a word they don’t know, look up that word in the dictionary, and write the definition in a comment. They can also write an opinion on a particular section, so there’s no right or wrong answer.

6. Teach Your Students How to Annotate a PDF

The same applies to other digital texts. Most digital reading platforms include a number of tools that make annotation easy, such as highlighters, text comments, sticky notes, and markup tools for underlining, circling, or drawing boxes. If you don’t have a digital reading platform, you can also teach students how to annotate a basic PDF using simple annotation tools like highlights or comments.

7. Make It Fun!

The more creative you get with annotation, the more engaged your students will be. So have some fun with it!

  • Make a scavenger hunt by listing specific components to identify
  • Color code concepts and have students use multicolored highlighters
  • Use stickers to represent and distinguish the five story elements: character, setting, plot, conflict, and theme
  • Choose simple symbols to represent concepts, and let students draw those as illustrated annotations: a magnifying glass could represent clues in the text, a key an important idea, and a heart could indicate a favorite part

Annotation Activity: Create a dice game where students have to find concepts and annotate them based on the number they roll. For example, 1 = Circle and define a word you don’t know, 2 = Underline a main character, 3 = Highlight the setting, etc.

Teaching students how to annotate gives them an invaluable tool for actively engaging with a text. It helps them think more critically, it increases retention, and it instills confidence in their ability to analyze more complex texts.


Writers' Center

Eastern Washington University

Reading and Study Strategies


Helpful Links

Software for Annotating

ProQuest Flow (sign up with your EWU email)

Foxit PDF Reader

Adobe Reader Pro (available on all campus computers)

Track Changes in Microsoft Word

What is Annotating?

Annotating is any action that deliberately interacts with a text to enhance the reader's understanding of, recall of, and reaction to the text. Sometimes called "close reading," annotating usually involves highlighting or underlining key pieces of text and making notes in the margins of the text. This page will introduce you to several effective strategies for annotating a text that will help you get the most out of your reading.

Why Annotate?

By annotating a text, you will ensure that you understand what is happening in a text after you've read it. As you annotate, you should note the author's main points, shifts in the message or perspective of the text, key areas of focus, and your own thoughts as you read. However, annotating isn't just for people who feel challenged when reading academic texts. Even if you regularly understand and remember what you read, annotating will help you summarize a text, highlight important pieces of information, and ultimately prepare yourself for discussion and writing prompts that your instructor may give you. Annotating means you are doing the hard work while you read, allowing you to reference your previous work and have a clear jumping-off point for future work.

1. Survey: This is your first time through the reading

You can annotate by hand or by using document software. You can also annotate on post-its if you have a text you do not want to mark up. As you annotate, use these strategies to make the most of your efforts:

  • Include a key or legend on your paper that indicates what each marking is for, and use a different marking for each type of information. Example: Underline for key points, highlight for vocabulary, and circle for transition points.
  • If you use highlighters, consider using different colors for different types of reactions to the text. Example: Yellow for definitions, orange for questions, and blue for disagreement/confusion.
  • Dedicate different tasks to each margin: Use one margin to make an outline of the text (thesis statement, description, definition #1, counter argument, etc.) and summarize main ideas, and use the other margin to note your thoughts, questions, and reactions to the text.

Lastly, as you annotate, make sure you are including descriptions of the text as well as your own reactions to the text. This will allow you to skim your notations at a later date to locate key information and quotations, and to recall your thought processes more easily and quickly.

  • Last Updated: Jul 21, 2021 3:01 PM
  • URL: https://research.ewu.edu/writers_c_read_study_strategies

"where P(A) is the proportion of time that the coders agree and P(E) is the proportion of times that we would expect them to agree by chance." (Carletta 1996: 4).
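The formula that this quotation glosses is not included in the excerpt. In its standard form the kappa coefficient is K = (P(A) - P(E)) / (1 - P(E)). As a minimal sketch, it might be computed for two coders as follows. The function name `kappa` and the sample labels are hypothetical, and the quotation does not say how P(E) should be estimated; this sketch assumes it is taken from each coder's own label frequencies (Cohen's variant).

```python
from collections import Counter


def kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders (hypothetical helper).

    Assumes both coders labelled the same items in the same order.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # P(A): proportion of items on which the coders agree.
    p_a = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # P(E): agreement expected by chance, estimated here (an assumption)
    # from each coder's own distribution of labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_a - p_e) / (1 - p_e)


# Made-up labels for four tokens, for illustration only.
coder_a = ["N", "N", "V", "V"]
coder_b = ["N", "V", "V", "V"]
print(kappa(coder_a, coder_b))
```

With these made-up labels the coders agree on three of four items (P(A) = 0.75) and chance agreement is 0.5, so the result is 0.5. A value of 1 means perfect agreement and 0 means agreement no better than chance.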

There is no doubt that annotation tends to be highly labour-intensive and time-consuming to carry out well. This is why it is appropriate to admit, as a final observation, that 'best practice' in corpus annotation is something we should all strive for — but which perhaps few of us will achieve.

9. Getting down to the practical task of annotation

To conclude, it is useful to say something about the practicalities of corpus annotation. Assume, say, that you have a text or a corpus you want to work on, and want to 'get the tags into the text'.

  • It is not necessary to have special software. You can annotate the text using a general-purpose text editor or word processor. But this means the job has to be done by hand, which risks being slow and prone to error.
  • For some purposes, particularly if the corpus is large and is to be made available for general use, it is important to have the annotation validated. That is, the vocabulary of annotation is controlled and is allowed to occur only in syntactically valid ways. A validating tool can be written from scratch, or can use macros for word processors or editors.
  • If you decide to use XML-compliant annotation, this means that you have the option to make use of the increasingly available XML editors. An XML editor, in conjunction with a DTD or schema, can do the job of enforcing well-formedness or validity without any programming of the software, although a high degree of expertise with XML will come in useful.
  • Special tagging software has been developed for large projects — for example the CLAWS tagger and Template Tagger used for the Brown Family of corpora and the BNC. Such programs or packages can be licensed for your own annotation work. (For CLAWS, see the UCREL website http://www.comp.lancs.ac.uk/ucrel/ .)
  • There are tagsets which come with specific software — e.g. the C5, C7 and C8 tagsets for CLAWS, and CHAT for the CHILDES system, which is the de facto standard for language acquisition data.
  • There are more general architectures for handling texts, language data and software systems for building and annotating corpora. The most prominent example of this is GATE ('general architecture for text engineering' http://gate.ac.uk ) developed at the University of Sheffield.
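The points above about validated and XML-compliant annotation can be illustrated with a small sketch. It uses Python's standard-library XML parser to check well-formedness, then checks each tag against a controlled vocabulary. The element name `w`, the attribute name `pos`, and the three tag labels are assumptions chosen for illustration, not a real project's scheme. This is also not DTD or schema validation, which would need a validating parser or an XML editor as described above.

```python
import xml.etree.ElementTree as ET

# Tiny illustrative set of tag labels (an assumption, not a full tagset).
ALLOWED_TAGS = {"AT0", "NN1", "VVZ"}


def check_annotation(xml_text, allowed=ALLOWED_TAGS):
    """Return a list of problems; an empty list means both checks passed."""
    try:
        # fromstring raises ParseError if the text is not well-formed XML.
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    problems = []
    # Controlled vocabulary: every <w> element must carry a known pos label.
    for word in root.iter("w"):
        tag = word.get("pos")
        if tag not in allowed:
            problems.append(f"unknown tag {tag!r} on word {word.text!r}")
    return problems


good = '<s><w pos="AT0">The</w> <w pos="NN1">cat</w> <w pos="VVZ">sleeps</w></s>'
bad_tag = '<s><w pos="XYZ">cat</w></s>'
malformed = '<s><w pos="NN1">cat</s>'

for sample in (good, bad_tag, malformed):
    print(check_annotation(sample))
```

The first sample passes both checks, the second is well-formed but uses a label outside the vocabulary, and the third fails at the well-formedness stage because the `w` element is never closed.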


© Geoffrey Leech 2004. The right of Geoffrey Leech to be identified as the Author of this Work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

All material supplied via the Arts and Humanities Data Service is protected by copyright, and duplication or sale of all or any part of it is not permitted, except that material may be duplicated by you for your personal research use or educational purposes in electronic or print form. Permission for any other use must be obtained from the Arts and Humanities Data Service.

Electronic or print copies may not be offered, whether for sale or otherwise, to any third party.

Annotating a text, or marking the pages with notes, is an excellent, if not essential, way to make the most out of the reading you do for college courses. Annotations make it easy to find important information quickly when you look back and review a text. They help you familiarize yourself with both the content and organization of what you read. They provide a way to begin engaging with ideas and issues directly through comments, questions, associations, or other reactions that occur to you as you read. In all these ways, annotating a text makes the reading process an active one, not just background for writing assignments, but an integral first step in the writing process.

A well-annotated text will accomplish all of the following:

  • clearly identify where in the text important ideas and information are located
  • express the main ideas of a text
  • trace the development of ideas/arguments throughout a text
  • introduce a few of the reader’s thoughts and reactions

Ideally, you should read a text through once before making major annotations. On that first pass, you may just want to circle unfamiliar vocabulary or concepts. This way, you will have a clearer idea about where major ideas and important information are in the text, and your annotating will be more efficient.

A brief description and discussion of four ways of annotating a text (highlighting/underlining, paraphrase/summary of main ideas, descriptive outline, and comments/responses) and a sample annotated text follow:

HIGHLIGHTING/UNDERLINING

Highlighting or underlining key words and phrases or major ideas is the most common form of annotating texts. Many people use this method to make it easier to review material, especially for exams. Highlighting is also a good way of picking out specific language within a text that you may want to cite or quote in a piece of writing. However, over-reliance on highlighting is unwise for two reasons. First, there is a tendency to highlight more information than necessary, especially when done on a first reading. Second, highlighting is the least active form of annotating. Instead of being a way to begin thinking and interacting with ideas in texts, highlighting can become a postponement of that process.

On the other hand, highlighting is a useful way of marking parts of a text that you want to make notes about. And it’s a good idea to highlight the words or phrases of a text that are referred to by your other annotations.

PARAPHRASE/SUMMARY OF MAIN IDEAS

Going beyond locating important ideas to being able to capture their meaning through paraphrase is a way of solidifying your understanding of these ideas. It’s also excellent preparation for any writing you may have to do based on your reading. A series of brief notes in the margins beside important ideas gives you a handy summary right on the pages of the text itself, and if you can take the substance of a sentence or paragraph and condense it into a few words, you should have little trouble clearly demonstrating your understanding of the ideas in question in your own writing.

DESCRIPTIVE OUTLINE

A descriptive outline shows the organization of a piece of writing, breaking it down to show where ideas are introduced and where they are developed. A descriptive outline allows you to see not only where the main ideas are but also where the details, facts, explanations, and other kinds of support for those ideas are located.

A descriptive outline will focus on the function of individual paragraphs or sections within a text. These functions might include any of the following:

  • summarizing a topic/argument/etc.
  • introducing an idea
  • adding explanation
  • giving examples
  • providing factual evidence
  • expanding or limiting the idea
  • considering an opposing view
  • dismissing a contrary view
  • creating a transition
  • stating a conclusion

This list is hardly exhaustive, and it’s important to recognize that several of these functions may be repeated within a text, particularly in texts that contain more than one major idea.

Making a descriptive outline allows you to follow the construction of the writer’s argument and/or the process of his/her thinking. It helps identify which parts of the text work together and how they do so.

COMMENTS/RESPONSES

You can use annotation to go beyond understanding a text’s meaning and organization by noting your reactions—agreement/disagreement, questions, related personal experience, connection to ideas from other texts, class discussions, etc. This is an excellent way to begin formulating your own ideas for writing assignments based on the text or on any of the ideas it contains.

Great Ideas From Readers

How Students and Teachers Benefit From Students Annotating Their Own Writing

Three ways to integrate annotation into the writing process that are inspired by our Annotated by the Author series.

By Matthew Johnson

A couple of years ago, we began a new series called Annotated by the Author, part of our Mentor Texts collection, in which we invite New York Times journalists, and winners of our student contests, to annotate their work, revealing the writing choices they made and explaining why they made them.

That series inspired Matthew Johnson, a writing teacher at Community High School in Ann Arbor, Mich., to have his students try annotating their own writing. Below, he tells us how this kind of self-annotation can benefit both students and teachers. He also shares three simple, yet impactful, ways students can “talk” to their own work.

If you’d like to learn more about teaching with Annotated by the Author, and our other Times mentor texts, join us at our live webinar on Thursday, Oct. 21, at 4 p.m. Eastern.

And if you have an idea for teaching with The Times, tell us about it here or browse our full collection of Reader Ideas.

— The Learning Network

The first installment of The Learning Network’s Annotated by the Author series, where the science writer Nicholas St. Fleur dissects his article “Tiny Tyrannosaur Hints at How T. Rex Became King,” was an instant hit in my classes, and not just because it had a tiny dinosaur. For many students, the window into the motivations, methods and moves of a seasoned writer opened their eyes to what goes into professional writing and what their own writing can be.

Last year, The Learning Network began to have the winners of their student contests annotate their work, and, like the series, my instruction using these annotated pieces grew as well. We used Abel John’s discussion of citing evidence in his editorial “Collar the Cat” to help us define what makes a source useful and reputable. Varya Kluev’s and Elizabeth Phelps’s insights into descriptive writing were just right to seed a conversation about how to artfully extend metaphors. And just this fall, I shared Ananya Udaygiri’s explanation of why she picked Animal Crossing as the topic for her editorial to help some of my seniors pick the right college essay topics for them.

Choosing a Topic to Write About With Ananya Udaygiri

My name is Ananya Udaygiri, and I am the author of ‘How Animal Crossing Will Save Gen Z.’

“Generation Z was born in the aftermath of 9/11, molded by the economic recession of 2008 and polished off by the coronavirus, the worst pandemic in a century, and that doesn’t even include the mounting crisis of climate change or the growing nationalism. Or the gun violence epidemic. Gen Z’s childhood is rooted in issues that would be unrecognizable only a decade prior. We are no strangers to a fight. So what drew us to a Japanese video game about living in a village with anthropomorphic animal neighbors? Like moths to a flame, or perhaps more appropriately, like children to their first love, Animal Crossing has captured the young teenage heart.”

How did you choose this topic?

So I wrote my essay in April, which obviously, the world was ending in April. So you began looking for those little pockets of calm. And within that I found Animal Crossing, which is this game that my friends and I love. I really wanted to talk about that feeling of peace that is so hard to find. And I was scared going in that I was writing about a video game. But ultimately, it was passion. It was something that I was passionate about. So I stuck on with it.

“In a New York Times article focusing on Animal Crossing in the age of coronavirus, the author described how Animal Crossing was a miniature escape for those isolated by the pandemic. He labeled it as a balm for the rushing tonnage of real world news. While that is certainly true, for Generation Z it encompasses all that and more.”

I was a little relieved and surprised to find something about Animal Crossing, but actually that article really inspired more of my work because as I was reading through it, I noticed that I was thinking, oh, if only they had talked about how this relates to kids, if only they had talked more about Gen Z. And then I realized that, oh, I can do that. So going into my essay, it was inspired by other things I saw and other things that, more importantly, I wanted to see.

“Our generation’s troubles are valid and growing. Buzzfeed News so aptly describes it as a ‘generation free-fall.’ So pick up your video game console. Load in Animal Crossing. Play the game. For Generation Z, Animal Crossing is hope, and it will save us all.”

So my advice to any young writers wanting to participate in the contest is don’t feel pressured to choose a topic that doesn’t feel like you. You don’t have to write about political events that you don’t care about. You don’t have to write about economic trends in the stock market. Talk about things that matter to you. I wrote about video games. And I use simple language, but I’m more proud of writing this than any academic paper I’ve written for school.

As I watched these students in the series so thoughtfully dissect their pieces time and again, I also began to wonder why we don’t regularly have students annotate their own work in the classroom. Suggesting that students annotate, or talk to, texts as they read is commonplace, but before the Annotated by the Author series, I’d never seen someone ask students to annotate their own writing. Then, last winter, after reading Maria Fernanda Benavides’s particularly insightful explanation of how she shifted her sentence structure to match her emotions in her narrative “Speechless,” I decided to try having my students annotate their own writing, and I haven’t looked back since.

The Benefits of Annotation for Students and for Teachers

For students, the potential positives of unpacking and explaining their own writing were instantly apparent and significant. These are some of the common advantages I found:

Annotation develops metacognition. The act of annotation is the very definition of metacognition, which is when students think about their own thinking and processes. Engaging in this sort of metacognition has been shown to significantly improve student learning outcomes, in part because it requires students to actively engage in monitoring their own growth instead of relying on the teacher to do it for them.

It positions students as active, serious participants in their own writing growth. Regular annotation of their work also recognizes students as purposeful writers and decision makers who have something to say about their craft, which is very different from how student writers are often approached. This recognition can be both empowering and motivating, especially for students who have often felt that their voices weren’t heard by those around them.

And it makes students better readers. Annotating and unpacking their work can act as a safe training ground for students to learn to better dissect and discuss the work of others in workshops and peer review.

For teachers, student annotation can be equally useful, as it opens up the following opportunities:

It helps us to see students’ thinking. Annotation allows teachers a glimpse into the students’ inner monologues about writing. These monologues can help teachers better plan and calibrate lessons so they meet the needs of students.

It allows us to give more targeted feedback. Teachers can be more precise and responsive when providing feedback to and conferencing with students when that writing has annotations because they allow the teacher to see the student’s mind-set, process, understanding and motivations, and allow the teacher to respond accordingly.

And it reduces our workload. Annotation helps students to more accurately self-assess their work, which can save teachers significant amounts of time when it comes to assessment, even as it helps students better understand and chart their own learning journey.

Three Ways to Have Students Annotate Their Work

Once one starts to look for them, there are numerous places where student annotations of their writing might yield such positive results — so many that I feel I am just scratching the surface. Still, over the last year, I’ve found some particular areas where they’ve made the biggest difference in my classes:

Short, Skills-Focused Assignments

Much of my grammar and rhetoric instruction involves students writing shorter papers where they use a certain grammatical and rhetorical skill in the context of their own writing. I’ve found this type of grammar instruction to be far more effective than the grammar worksheets I used to do, but for many years I also found it more time-consuming to read and assess those extra papers.

This all changed, though, when students started annotating the choices they made. For example, in my class, we do a short unit on the grammatical tools writers can use to add emphasis (colons, dashes, appositives, parallel structure, purposeful fragments and so on). To assess their understanding of these “emphasizers,” I have my students write a rant on any topic that they want and then use the comments feature on Google Docs to explain how, when and why they used the tools we discussed in class.

By using the highlighted comments as a guide, I can now assess these pieces faster because I know exactly where to look. I can also assess them better because I can see in students’ own words how well they understand the concept.

Pre-feedback Moments

Feedback, whether it is from teachers or peers, tends to be a one-way street where the reader responds to the writer and then the conversation largely stops. I have found that while that approach can yield some growth, both peer and teacher responses often have a far larger impact when they are true conversations, especially when they are initiated by the author.

This is why I now have my students write annotations before getting peer and teacher responses to let the reader know what they are thinking, questioning and needing. Here is how I prompt them to do that:

These annotations don’t take long, but they often add a great deal — acting as icebreakers for conversation, ensuring that students get the help they need, and establishing a clear foundation from which both parties can work as collaborators toward improving the student’s piece.

Final Draft Self-Evaluations

More and more educators are growing interested in the idea of having students do meaningful self-assessment of their work in class. Self-assessment adds an additional layer of reflection and metacognition, and it can free up teachers to give feedback in the formative, or early, stages of student work, where it is most effective. Further, students assessing their work first can act as a bulwark against the possibility that students will feel blindsided or injured by grades and assessments because the teacher can see how they feel about their work first.

The trouble with self-assessment is that many students are unaccustomed to doing it, which can lead to problems with accuracy and students feeling unsure about how to evaluate themselves. Requiring students to use annotations to support their specific assessments can help with both of these issues: The act of finding and explaining the scores means that they need to be grounded in evidence, and the very act of looking for that evidence can help to train students in how to better assess themselves.

Here is the slide I use to prompt these kinds of self-assessments:

____________

Annotation can be a potent tool for helping students become better and more savvy readers, so it makes sense that it would also be a potent tool for helping students to become better and more savvy writers. The secret I’ve found to using it, though, is that the annotation needs to be meaningful. As soon as it feels more like a hoop to be jumped through, as can sometimes happen with misapplied classroom-required annotations during reading, all of the advantages of annotating their own work vanish in an instant.

This is why I explain much of what I share above with my students as a way to make a case for the value of annotating one’s own writing. It is also why I now use the essays from The Learning Network’s Annotated by the Author series both as mentor texts for the craft of writing and for the craft of learning how to dissect one’s own work.

Because when it serves a thoughtful purpose, student annotation is one of the most exciting pedagogical tools I’ve found in a long time — one that opens students up to what revision and writing can be, opens up the teacher to providing better and faster feedback and assessment, and generally opens up powerful lines of communication between both parties that often lie dormant.

Understanding & Interacting with a Text

Annotations, definition and purpose.

Annotating literally means taking notes within the text as you read.  As you annotate, you may combine a number of reading strategies—predicting, questioning, dealing with patterns and main ideas, analyzing information—as you physically respond to a text by recording your thoughts.  Annotating may occur on a first or second reading of the text, depending on the text’s difficulty or length. You may annotate in different formats, either in the margins of the text or in a separate notepad or document. The main thing to remember is that annotation is at the core of active reading. By reading carefully and pausing to reflect upon, mark up, and add notes to a text as you read, you can greatly improve your understanding of that text.

Think of annotating a text in terms of having a conversation with the author in real time. You wouldn’t sit passively while the author talked at you. If you did, you wouldn’t be able to get clarification or ask questions. Your thought processes would probably close down and you would not engage in thinking about larger meanings related to the topic. Conversation works best when people are active participants. Annotation is a form of active involvement with a text.

Reasons to Annotate

There are a number of reasons to annotate a text:

  • Annotation ultimately saves reading time. While it may take more time up front as you read, annotating while you read can help you avoid having to re-read passages in order to get the meaning. That’s because…
  • Annotation improves understanding. By pausing to reflect as you read, annotating a text helps you figure out if you’re understanding what you’re reading. If not, you can immediately re-read or seek additional information to improve your understanding. This is called “monitoring comprehension.”
  • Annotation increases your odds of remembering what you’ve read, because you write those annotations in your own words, making the information your own. You also leave behind a set of notes that can help you find key information the next time you need to refer to that text.
  • Annotation provides a record of your deeper questions and thoughts as you read, insights related to analyzing, interpreting, and going beyond the text into related issues.  Annotations such as these will be useful when you’re asked to respond to a text through reacting, applying, analyzing, and synthesizing, since these types of annotations record your own thoughts.  Much academic work in college is intended to get you to offer your own, informed thoughts (as opposed to simple recall and regurgitation of information); annotating a text helps you capture key personal, analytical insights as you read.

The following video offers a brief, clear example of annotating a text.

What to Annotate

You’ll find that you’re annotating differently in different texts, depending on your background knowledge of the topic, your own ease with reading the text, and the type of text, among other variables.  There’s no single formula for annotating a text.  Instead, there are different types of annotations that you may make, depending on the particular text.

  • Mark the thesis or main idea sentence, if there is one in the text.  Or note the implied main idea.  In either case, phrase that main idea in your own words.
  • Mark places that seem important, interesting, and/or confusing.
  • Note your agreement or disagreement with an idea in the text.
  • Link a concept in the text to your own experience.
  • Write a reminder to look up something – an unknown word, a difficult concept, or a related idea that occurred to you.
  • Record questions you have about what you are reading. These questions generally fall into two different categories, to clarify meaning and to evaluate what you’ve read.
  • Note any biases and unstated assumptions (your own included).
  • Paraphrase a difficult passage by putting it into your own words.
  • Summarize a lengthy section of a text to extract the main ideas–again in your own words.
  • Note important transition words that show a shift in thought; transitions show how the author is linking ideas.  This is especially important if you’re reading and annotating a text intended to persuade the reader to a particular point of view, as it allows you to clarify and evaluate the author’s line of reasoning.
  • Note repeated words or phrases; it’s likely that such emphasis relates to a key concept or main idea.
  • Note the writer’s tone—straightforward, sarcastic, sincere, witty—and how it influences the ideas presented.
  • Note idea linkages between this text and another text.
  • Note idea linkages between this text and key concepts or theories of a discipline. For example, does the author offer examples relating to theories of motivation that you’re studying in a psychology class?
  • And more…again, annotations vary according to the text and your background in the text’s topic.

View the following video, which reviews reading strategies for approximately the first three minutes and then moves into a comprehensive discussion of the types of things to annotate in non-fiction texts.

How to Annotate

Make sure to annotate through writing.  Do not – do not –  simply highlight or underline existing words in the text.  While your annotations may start with a few underlined words or sentences, you should always complete your thoughts through a written annotation that identifies why you underlined those words (e.g., key ideas, your own reaction to something, etc.). The pitfall of highlighting is that readers tend to do it too much, and then have to go back to the original text and re-read most of it.  By writing annotations in your own words, you’ve already moved to a higher level in your conversation with the text.

If you don’t want to write in a margin of a book or article, use sticky notes for your annotations. If the text is in electronic form, the format itself may have built-in annotation tools; if not, you can paste the sentences and passages you want to annotate into a Word document and write your annotations there.

You may also want to create your own system of symbols to mark certain things such as main idea (*), linkage to ideas in another text (+), confusing information that needs to be researched further (!), or similar idea (=). The symbols and marks should make sense to you, and you should apply them consistently from text to text, so that they become an easy shorthand for annotation. However, annotations should not consist of symbols only; you need to include words to remember why you marked the text in that particular place.

Above all, be selective about what to mark; if you end up annotating most of a page or even most of a paragraph, nothing will stand out, and you will have defeated the purpose of annotating.

Here’s one brief example of annotation:

Sample Annotation

What follows is a sample annotation of the first few paragraphs of an article from CNN, “One quarter of giant panda habitat lost in Sichuan quake,” July 29, 2009. Sample annotations are shown in square brackets.

“The earthquake in Sichuan, southwestern China, last May left around 69,000 people dead and 15 million people displaced. Now ecologists have assessed the earthquake’s impact on biodiversity [look this word up] and the habitat for some of the last existing wild giant pandas.

According to the report published in “Frontiers in Ecology and the Environment,” 23 percent of the pandas’ habitat in the study area was destroyed, and fragmentation of the remaining habitat could hinder panda reproduction. [How was this data gathered? Do we know that fragmentation will hinder panda reproduction?]

The Sichuan region is designated as a global hotspot for biodiversity, according to Conservation International. Home to more than 12,000 species of plants and 1,122 species of vertebrates, the area includes more than half of the habitat for the Earth’s wild giant panda population, said study author Weihua Xu of the Chinese Academy of Sciences in Beijing.” [So can we assume that having so much of the pandas’ habitat destroyed will impact other species here?]

Link to two additional examples of what and how to annotate

  • Invention: Annotating a Text from Hunter College, included as a link in Maricopa Community College’s Reading 100 open educational resource. There’s a very clearly-annotated sample text at the end of this handout.
  • Ethnic Varieties by Walt Wolfram, included as a link in Let’s Get Writing.

Summary: Annotation = Making Connections

The video below offers a review of reading concepts in the first part, focused on the concept of reading as connecting with a text.  From approximately mid-way to the end, the video offers a good extended example and discussion of annotating a text.

Note: if you want to try annotating an article and find the one in the video difficult to read, you may want to practice on a similar article about the same topic, “Tinker V. Des Moines Independent Community School District: Kelly Shackelford on Symbolic Speech” on the blog of the U.S. Supreme Court.

Read the paragraphs from “Cultural Relativism” that deal with the sociological perspective. Annotate the paragraphs with insights, questions, and thoughts that occur to you as you read.

  • Annotations, includes material adapted from Excelsior College Online Reading Lab, Let's Get Writing, UMRhetLab, Reading 100, and Basic Reading and Writing; attributions below. Authored by: Susan Oaks. License: CC BY-NC-SA: Attribution-NonCommercial-ShareAlike
  • Annotating: Creating an Annotation System. Provided by: Excelsior College. Located at: https://owl.excelsior.edu/orc/what-to-do-while-reading/annotating/annotating-creating-an-annotation-system/ . Project: Excelsior College Online Reading Lab. License: CC BY: Attribution
  • Chapter 1 - Critical Reading. Authored by: Elizabeth Browning. Provided by: Virginia Western Community College. Located at: https://vwcceng111.pressbooks.com/chapter/chapter-1-critical-reading/#whileyouread . Project: Let's Get Writing. License: CC BY-NC-SA: Attribution-NonCommercial-ShareAlike
  • Strategies for Active Reading. Authored by: Guy Krueger. Provided by: University of Mississippi. Located at: https://courses.lumenlearning.com/olemiss-writing100/chapter/strategies-for-active-reading/ . Project: UMRhetLab. License: CC BY: Attribution
  • Annotating a Text (from Hunter College). Provided by: Maricopa Community College. Located at: https://learn.maricopa.edu/courses/904536/files/32965647?module_item_id=7199522 . Project: Reading 100. License: CC BY: Attribution
  • Summary Skills. Provided by: Lumen Learning. Located at: https://courses.lumenlearning.com/suny-basicreadingwriting/chapter/outcome-summary-skills/ . Project: Basic Reading and Writing. License: CC BY: Attribution
  • image of open book with colored tabs and colored pencils. Authored by: Luisella Planeta. Provided by: Pixabay. Located at: https://pixabay.com/photos/books-pencils-pens-map-dictionary-3826148/ . License: CC0: No Rights Reserved
  • video Textbook Reading Strategies - Annotate the Text. Authored by: DistanceLearningKCC. Provided by: Kirkwood Community College. Located at: https://www.youtube.com/watch?v=bE1ot8KWJrk . License: Other. License Terms: YouTube video
  • video Annotating Non-Fiction Texts. Authored by: Arri Weeks. Located at: https://www.youtube.com/watch?v=QrvNIVF9EbI . License: Other. License Terms: YouTube video
  • video Making Connections During Reading. Provided by: WarnerJordanEducation. Located at: https://www.youtube.com/watch?v=hF54mvmFkxg . License: Other. License Terms: YouTube video

ABLE blog: thoughts, learnings and experiences

Annotating text: The complete guide to close reading

As students, researchers, and self-learners, we understand the power of reading and taking smart notes. But what happens when we combine those together? This is where annotating text comes in.

Annotated text is a written piece that includes additional notes and commentary from the reader. These notes can be about anything from the author's style and tone to the main themes of the work. By providing context and personal reactions, annotations can turn a dry text into a lively conversation.

Creating text annotations during close readings can help you follow the author's argument or thesis and make it easier to find critical points and supporting evidence. Plus, annotating your own texts in your own words helps you to better understand and remember what you read.

This guide will take a closer look at annotating text, discuss why it's useful, and how you can apply a few helpful strategies to develop your annotating system.

What does annotating text mean?

Text annotation refers to adding notes, highlights, or comments to a text. This can be done using a physical copy in textbooks or printable texts. Or you can annotate digitally through an online document or e-reader.

Generally speaking, annotating text allows readers to interact with the content on a deeper level, engaging with the material in a way that goes beyond simply reading it. There are different levels of annotation, but all annotations should aim to do one or more of the following:

  • Summarize the key points of the text
  • Identify evidence or important examples
  • Make connections to other texts or ideas
  • Think critically about the author's argument
  • Make predictions about what might happen next

When done effectively, annotation can significantly improve your understanding of a text and your ability to remember what you have read.

What are the benefits of annotation?

There are many reasons why someone might wish to annotate a document. It's commonly used as a study strategy and is often taught in English Language Arts (ELA) classes. Students are taught how to annotate texts during close readings to identify key points, evidence, and main ideas.

In addition, this reading strategy is also used by those who are researching for self-learning or professional growth. Annotating texts can help you keep track of what you’ve read and identify the parts most relevant to your needs. Even reading for pleasure can benefit from annotation, as it allows you to keep track of things you might want to remember or add to your personal knowledge management system .

Annotating has many benefits, regardless of your level of expertise. When you annotate, you're actively engaging with the text, which can help you better understand and learn new things . Additionally, annotating can save you time by allowing you to identify the most essential points of a text before starting a close reading or in-depth analysis.

There are few studies directly on annotation, but the body of research is growing. In one 2022 study, specific annotation strategies increased student comprehension , engagement, and academic achievement. Students who annotated read slower, which helped them break down texts and visualize key points. This helped students focus, think critically , and discuss complex content.

Annotation can also be helpful because it:

  • Allows you to quickly refer back to important points in the text without rereading the entire thing
  • Helps you to make connections between different texts and ideas
  • Serves as a study aid when preparing for exams or writing essays
  • Identifies gaps in your understanding so that you can go back and fill them in

The process of annotating text can make your reading experience more fruitful. Adding comments, questions, and associations directly to the text makes the reading process more active and enjoyable.

How do you annotate text?

There are many different ways to annotate while reading. The traditional method of annotating uses highlighters, markers, and pens to underline, highlight, and write notes in paper books. Modern methods have now gone digital with apps and software. You can annotate on many note-taking apps, as well as online documents like Google Docs.

While there are documented benefits of handwritten notes, recent research shows that digital methods are effective as well. Among students in an introductory college writing course, more highlighting on digital texts correlated with better reading comprehension than more highlighting on paper did.

No matter what method you choose, the goal is always to make your reading experience more active, engaging, and productive. To do so, the process can be broken down into three simple steps:

  • Do the first read-through without annotating to get a general understanding of the material.
  • Reread the text and annotate key points, evidence, and main ideas.
  • Review your annotations to deepen your understanding of the text.

Of course, there are different levels of annotation, and you may only need to do some of the three steps. For example, if you're reading for pleasure, you might only annotate key points and passages that strike you as interesting or important. Alternatively, if you're trying to simplify complex information in a detailed text, you might annotate more extensively.

The type of annotation you choose depends on your goals and preferences. The key is to create a plan that works for you and stick with it.

Annotation strategies to try

When annotating text, you can use a variety of strategies. The best method for you will depend on the text itself, your reason for reading, and your personal preferences. Start with one of these common strategies if you don't know where to begin.

  • Questioning: As you read, note any questions that come to mind as you engage in critical thinking . These could be questions about the author's argument, the evidence they use, or the implications of their ideas.
  • Summarizing: Write a brief summary of the main points after each section or chapter. This is a great way to check your understanding, help you process information , and identify essential information to reference later.
  • Paraphrasing: In addition to (or instead of) summaries, try paraphrasing key points in your own words. This will help you better understand the material and make it easier to reference later.
  • Connecting: Look for connections between different parts of the text or other ideas as you read. These could be things like similarities, contrasts, or implications. Make a note of these connections so that you can easily reference them later.
  • Visualizing: Sometimes, it can be helpful to annotate text visually by drawing pictures or taking visual notes . This can be especially helpful when trying to make connections between different ideas.
  • Responding: Another way to annotate is to jot down your thoughts and reactions as you read. This can be a great way to personally engage with the material and identify any areas you need clarification on.

Combining the three-step annotation process with one or more strategies can create a customized, powerful reading experience tailored to your specific needs.

7 tips for effective annotations

Once you've gotten the hang of the annotating process and know which strategies you'd like to use, there are a few general tips you can follow to make the annotation process even more effective.

1. Read with a purpose. Before you start annotating, take a moment to consider what you're hoping to get out of the text. Do you want to gain a general overview? Are you looking for specific information? Once you know what you're looking for, you can tailor your annotations accordingly.

2. Be concise. When annotating text, keep it brief and focus on the most important points. Otherwise, you risk annotating too much, which can feel a bit overwhelming, like having too many tabs open . Limit yourself to just a few annotations per page until you get a feel for what works for you.

3. Use abbreviations and symbols. You can use abbreviations and symbols to save time and space when annotating digitally. If annotating on paper, you can use similar abbreviations or symbols or write in the margins. For example, you might use ampersands, plus signs, or question marks.

4. Highlight or underline key points. Use highlighting or underlining to draw attention to significant passages in the text. This can be especially helpful when reviewing a text for an exam or essay. Try using different colors for each read-through or to signify different meanings.

5. Be specific. Vague annotations aren't very helpful. Make sure your notes are clear and straightforward so you can easily refer to them later. This may mean including specific inferences, key points, or questions in your annotations.

6. Connect ideas. When reading, you'll likely encounter ideas that connect to things you already know. When these connections occur, make a note of them. Use symbols or even sticky notes to connect ideas across pages. Annotating this way can help you see the text in a new light and make connections that you might not have otherwise considered.

7. Write in your own words. When annotating, copying what the author says verbatim can be tempting. However, it's more helpful to write, summarize or paraphrase in your own words. This will force you to engage your information processing system and gain a deeper understanding.

These tips can help you annotate more effectively and get the most out of your reading. However, it’s important to remember that, just like self-learning , there is no one "right" way to annotate. The process is meant to enrich your reading comprehension and deepen your understanding, which is highly individual. Most importantly, your annotating system should be helpful and meaningful for you.

Engage your learning like never before by learning how to annotate text

Learning to effectively annotate text is a powerful tool that can improve your reading, self-learning , and study strategies. Using an annotating system that includes text annotations and note-taking during close reading helps you actively engage with the text, leading to a deeper understanding of the material.

Try out different annotation strategies and find what works best for you. With practice, annotating will become second nature and you'll reap all the benefits this powerful tool offers.

I hope you have enjoyed reading this article. Feel free to share, recommend and connect 🙏

Boris

More Than Highlighting: Creative Annotations

Active strategies for annotation like collaborative work and illustration increase students’ comprehension and retention.

A page of a spiral notebook listing the elements of a tragedy with various drawings and doodles

Annotating texts is not the most exciting tactic for reading comprehension. In my classroom experience, even the mention of the word annotate  was met with looks of confusion or boredom. Traditional annotations have been students’ only interactions with the text. When students are asked to underline important parts of the texts, they will usually pick the first line that seems appealing or attempt to highlight the whole page of text with pretty-colored highlighters. Simply underlining the text will not meet the needs of our 21st-century learners.

Annotations are a critical strategy teachers can use to encourage students to interact with a text. They promote a deeper understanding of passages and encourage students to read with a purpose. Teachers can use annotations to emphasize crucial literacy skills like visualization, asking questions, and making inferences.

Purposeful instruction with annotating texts is required for students to benefit from this strategy. Focused instructional activities associated with annotation make the process engaging. Teachers can encourage students to participate in the annotation in new ways that use visual and collaborative strategies.

Illustrated Annotations

Illustrated annotations use images to increase comprehension and understanding. Students create illustrations to represent concepts and elements of literature. Prior to reading the text, the students create a visual representation or symbol for the concept or element of focus for the learning target. When the students annotate the text, they use the illustration they created.

I recently used this strategy to teach Hamlet. Specifically, we focused on the seven elements of Shakespearean tragedies. Before reading the texts, students drew visuals or symbols of each element. Students could choose any illustration that enhanced their learning. Those who were not adept at art could draw a “TF” for tragic flaw. After the students created their illustrations, I selected chunks of the texts for the students to annotate throughout our reading of the play.

The process of creating an illustration helps students synthesize information and increases student engagement and creativity. It makes annotating texts a more hands-on experience and makes their learning meaningful and personal. One challenge with this assignment occurs when students believe they cannot draw, do not have artistic talent, or are not creative. Allowing less artistic students to use symbols or simple drawings also emphasizes the importance of student choice. The purpose of the assignment is to capture the symbolism of concepts, so they can create any marking that represents their perception and understanding of a concept.

A printout of Hamlet's 'to be or not to be' soliloquy, marked with student notes and drawings

Collaborative Annotations

Another annotation strategy is collaborative annotation, or an annotation on a shared text by multiple students. Students annotate the same text and analyze each person’s annotations to find inspiration, discover similarities, or ask questions.

Students were given guided analysis prompts while annotating the text and their peers’ responses. During this lesson, the students were instructed to write two extended comments and pose one question per page of text. The next set of students had to do the same, but they could comment on the text or a previous annotation from another student. Each class was able to view and analyze the annotations of their peers from previous classes. At the end, students had a collection of annotations that showed several different processes of reading a text.

This strategy encourages students to close-read a text. Students think critically and have a deeper and more meaningful understanding of the text. Students also collaborate and communicate about a text with their peers by commenting and questioning the marks of others.

Personalizing the Process

Annotation strategies can be differentiated for learners in a single classroom by adjusting the requirements for each reading. Learning targets for the annotation activities can be modified for different learning needs.

Digital applications may be used in several different ways. In order to facilitate collaborative annotations in a digital format, teachers can use Google Docs. Students analyze the same text and leave comments or highlight portions of the text. Students can easily share documents and comment on other students’ annotations. For visual annotations, teachers can use graphic tools like Adobe Spark. Students can pull parts of the texts and choose pictures to represent their interpretations.

Teachers in any content area can use these annotation strategies for any texts in the class to emphasize certain themes or to promote literacy in their classes. Creativity and collaboration are crucial to 21st-century learners. When creative annotating strategies facilitate student interaction with a text, the annotation process is a meaningful learning experience and not just a coloring page with meaningless highlights.

1 About Annotation (and an opportunity to practice)

Two pages of a book with numerous handwritten notes in the margins, overlapping the text, and underlined passages.

Annotation, the act of adding additional information as a note attached to a specific part of a published work (or simply highlighting key passages), is a familiar academic but also everyday practice.

As described in Remi Kalir and Antero Garcia’s book, Annotation:

Annotation provides information, making knowledge more accessible. Annotation shares commentary, making both expert opinion and everyday perspective more transparent. Annotation sparks conversation, making our dialogue – about art, religion, culture, politics, and research – more interactive. Annotation expresses power, making civic life more robust and participatory. And annotation aids learning, augmenting our intellect, cognition, and collaboration. This is why annotation matters.

Making notes in printed works is a centuries-old practice; the authors share some historical examples.

Marginalia thrived in England during the sixteenth century, as studies of book culture during the rule of Elizabeth I and James I demonstrate. Annotated books were routinely exchanged among scholars and friends as “social activity” throughout the Victorian era. Some of the most significant commentary about the Talmud, first written in the eleventh century, has been featured prominently as annotation in print editions since the early 1500s. Today, scientists’ annotation of the human genome and proteome for large-scale biomedical research relies upon techniques that are both similar to and also very different from linguists and historians who have translated, annotated, and digitally archived Babylonian and Assyrian clay tablets. From the annotatio of Roman imperial law to the medieval gloss, annotation nowadays helps people to write computer code, evaluate chess games, and interpret rap lyrics.

Kalir and Garcia offer examples of everyday acts, such as charting a child’s growth on a doorway, adding notes to a family recipe card, even creating meme images, that are all acts of annotation. Can you think of your own everyday activities that might be considered annotation? Where do you see it in the world around you?

Web annotation not only provides a similar functionality, but expands its capabilities by having it take place in an open, common space, making it a social process. With the open source platform Hypothesis, we all can add commentary, questions, and additional resources to any public content on the web.

If this is new to you, we can start right here with some practice annotation.

Annotate Now With Hypothesis!

Hypothes.is is a free, open-source social annotation technology regularly used by educators. It adds an annotation layer to any public web page or document. To participate in social annotation conversations, start by creating a free Hypothesis account.

The entire sign-up process will take less than one minute. In fact, you will be able to log in or create an account directly from within this Pressbook.

Please note that Hypothesis is not a social network and does not collect any personally identifiable information except for an email address. Additionally, all public annotations are licensed Creative Commons CC0 by default.

This page of this book is already set up to be annotated with Hypothesis. How do you know? Look in the upper right corner for a gray button with a < symbol.

The page of this book is open with an arrow pointing to the Hypothes.is tool button

Upon opening the Hypothesis tools via the < button, you see the notes previously added. If you are not logged in to Hypothesis, you can log in here or even create a new account. Once you are logged in, you can even reply to an existing annotation.

All annotations available in this page will be indicated by a yellow highlight color. Click any annotation to read it in the Hypothesis sidebar.

How do we create a new note? Let’s annotate this page, maybe in the area that mentions everyday acts of annotation, or where we see it in the real world. How does one annotate? Just select a phrase or word that you would like to add information to (shorter selections of text work better). Hypothesis then offers, right there, a choice to add a note as an annotation.

The words “even creating meme images” are selected and the tool buttons Annotate and Highlight appear above them

Choosing Annotate opens a small composition window in the Hypothesis sidebar. You can add text, create hyperlinks, and insert images or media.

Use this page as a place to practice writing annotations. Once you have an understanding of the process, continue to the next chapter, where we provide examples of how to annotate the Recommendation on OER.

Annotating the UNESCO Recommendation on OER Copyright © 2021 by UNESCO is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.

What Is an Annotated Bibliography? | Examples & Format

Published on March 9, 2021 by Jack Caulfield. Revised on August 23, 2022.

An annotated bibliography is a list of source references that includes a short descriptive text (an annotation) for each source. It may be assigned as part of the research process for a paper , or as an individual assignment to gather and read relevant sources on a topic.

Scribbr’s free Citation Generator allows you to easily create and manage your annotated bibliography in APA or MLA style. To generate a perfectly formatted annotated bibliography, select the source type, fill out the relevant fields, and add your annotation.

An example of an annotated source is shown below:

Annotated source example

Table of contents

  • Annotated bibliography format: APA, MLA, Chicago
  • How to write an annotated bibliography
  • Descriptive annotation example
  • Evaluative annotation example
  • Reflective annotation example
  • Finding sources for your annotated bibliography
  • Frequently asked questions about annotated bibliographies

Annotated bibliography format: APA, MLA, Chicago

Make sure your annotated bibliography is formatted according to the guidelines of the style guide you’re working with. Three common styles are covered below:

APA style

In APA Style, both the reference entry and the annotation should be double-spaced and left-aligned.

The reference entry itself should have a hanging indent . The annotation follows on the next line, and the whole annotation should be indented to match the hanging indent. The first line of any additional paragraphs should be indented an additional time.

APA annotated bibliography

MLA style

In an MLA style annotated bibliography, the Works Cited entry and the annotation are both double-spaced and left-aligned.

The Works Cited entry has a hanging indent. The annotation itself is indented 1 inch (twice as far as the hanging indent). If there are two or more paragraphs in the annotation, the first line of each paragraph is indented an additional half-inch, but not if there is only one paragraph.

MLA annotated bibliography

Chicago style

In a  Chicago style annotated bibliography , the bibliography entry itself should be single-spaced and feature a hanging indent.

The annotation should be indented, double-spaced, and left-aligned. The first line of any additional paragraphs should be indented an additional time.

Chicago annotated bibliography

How to write an annotated bibliography

For each source, start by writing (or generating) a full reference entry that gives the author, title, date, and other information. The annotated bibliography format varies based on the citation style you’re using.

The annotations themselves are usually between 50 and 200 words in length, typically formatted as a single paragraph. This can vary depending on the word count of the assignment, the relative length and importance of different sources, and the number of sources you include.

Consider the instructions you’ve been given or consult your instructor to determine what kind of annotations they’re looking for:

  • Descriptive annotations : When the assignment is just about gathering and summarizing information, focus on the key arguments and methods of each source.
  • Evaluative annotations : When the assignment is about evaluating the sources , you should also assess the validity and effectiveness of these arguments and methods.
  • Reflective annotations : When the assignment is part of a larger research process, you need to consider the relevance and usefulness of the sources to your own research.

These specific terms won’t necessarily be used. The important thing is to understand the purpose of your assignment and pick the approach that matches it best. Interactive examples of the different styles of annotation are shown below.

Descriptive annotation example

A descriptive annotation summarizes the approach and arguments of a source in an objective way, without attempting to assess their validity.

In this way, it resembles an abstract , but you should never just copy text from a source’s abstract, as this would be considered plagiarism . You’ll naturally cover similar ground, but you should also consider whether the abstract omits any important points from the full text.

The interactive example shown below describes an article about the relationship between business regulations and CO2 emissions.

Rieger, A. (2019). Doing business and increasing emissions? An exploratory analysis of the impact of business regulation on CO2 emissions. Human Ecology Review, 25(1), 69–86. https://www.jstor.org/stable/26964340

Evaluative annotation example

An evaluative annotation also describes the content of a source, but it goes on to evaluate elements like the validity of the source’s arguments and the appropriateness of its methods.

For example, the following annotation describes, and evaluates the effectiveness of, a book about the history of Western philosophy.

Kenny, A. (2010). A new history of Western philosophy: In four parts. Oxford University Press.

Reflective annotation example

A reflective annotation is similar to an evaluative one, but it focuses on the source’s usefulness or relevance to your own research.

Reflective annotations are often required when the point is to gather sources for a future research project, or to assess how they were used in a project you already completed.

The annotation below assesses the usefulness of a particular article for the author’s own research in the field of media studies.

Manovich, Lev. (2009). The practice of everyday (media) life: From mass consumption to mass cultural production? Critical Inquiry , 35 (2), 319–331. https://www.jstor.org/stable/10.1086/596645

Manovich’s article assesses the shift from a consumption-based media culture (in which media content is produced by a small number of professionals and consumed by a mass audience) to a production-based media culture (in which this mass audience is just as active in producing content as in consuming it). He is skeptical of some of the claims made about this cultural shift; specifically, he argues that the shift towards user-made content must be regarded as more reliant upon commercial media production than it is typically acknowledged to be. However, he regards web 2.0 as an exciting ongoing development for art and media production, citing its innovation and unpredictability.

The article is outdated in certain ways (it dates from 2009, before the launch of Instagram, to give just one example). Nevertheless, its critical engagement with the possibilities opened up for media production by the growth of social media is valuable in a general sense, and its conceptualization of these changes frequently applies just as well to more current social media platforms as it does to Myspace. Conceptually, I intend to draw on this article in my own analysis of the social dynamics of Twitter and Instagram.

Finding sources for your annotated bibliography

Before you can write your annotations, you’ll need to find sources. If the annotated bibliography is part of the research process for a paper, your sources will be those you consult and cite as you prepare the paper. Otherwise, your assignment and your choice of topic will guide you in what kind of sources to look for.

Make sure that you’ve clearly defined your topic , and then consider what keywords are relevant to it, including variants of the terms. Use these keywords to search databases (e.g., Google Scholar ), using Boolean operators to refine your search.
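
As a rough illustration of combining keywords with Boolean operators, the sketch below joins variants of a term with OR and joins separate concepts with AND. The function name build_query and the sample keywords are hypothetical, and the exact operator syntax differs between databases, so treat the output as a starting point to adapt rather than a query guaranteed to work everywhere.

```python
def build_query(term_groups):
    """Join variants with OR inside each group, then join groups with AND."""
    clauses = []
    for group in term_groups:
        # Quote each term so multi-word phrases are searched as a unit.
        quoted = [f'"{term}"' for term in group]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Hypothetical keywords: two variants of one concept, plus a second concept.
query = build_query([
    ["annotation", "marginalia"],
    ["reading comprehension"],
])
print(query)
# ("annotation" OR "marginalia") AND ("reading comprehension")
```

Grouping the variants first keeps the search broad for each concept while still requiring every concept to appear in a result.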

Sources can include journal articles, books, and other source types , depending on the scope of the assignment. Read the abstracts or blurbs of the sources you find to see whether they’re relevant, and try exploring their bibliographies to discover more. If a particular source keeps showing up, it’s probably important.

Once you’ve selected an appropriate range of sources, read through them, taking notes that you can use to build up your annotations. You may even prefer to write your annotations as you go, while each source is fresh in your mind.

Frequently asked questions about annotated bibliographies

An annotated bibliography is an assignment where you collect sources on a specific topic and write an annotation for each source. An annotation is a short text that describes and sometimes evaluates the source.

Any credible sources on your topic can be included in an annotated bibliography . The exact sources you cover will vary depending on the assignment, but you should usually focus on collecting journal articles and scholarly books . When in doubt, utilize the CRAAP test !

Each annotation in an annotated bibliography is usually between 50 and 200 words long. Longer annotations may be divided into paragraphs .

The content of the annotation varies according to your assignment. An annotation can be descriptive, meaning it just describes the source objectively; evaluative, meaning it assesses its usefulness; or reflective, meaning it explains how the source will be used in your own research .

A source annotation in an annotated bibliography fulfills a similar purpose to an abstract : they’re both intended to summarize the approach and key points of a source.

However, an annotation may also evaluate the source , discussing the validity and effectiveness of its arguments. Even if your annotation is purely descriptive , you may have a different perspective on the source from the author and highlight different key points.

You should never just copy text from the abstract for your annotation, as doing so constitutes plagiarism .

Cite this Scribbr article

Caulfield, J. (2022, August 23). What Is an Annotated Bibliography? | Examples & Format. Scribbr. Retrieved February 27, 2024, from https://www.scribbr.com/citing-sources/annotated-bibliography/

Data Annotation in 2024: Why it matters & Top 8 Best Practices

Annotated data is an integral part of various machine learning, artificial intelligence (AI) and GenAI applications. It is also one of the most time-consuming and labor-intensive parts of AI/ML projects, and it is one of the top limitations of AI implementation for organizations. Whether you work with an AI data service or perform annotation in-house, you need to get this process right.

Tech leaders and developers need to focus on improving data annotation for their data-hungry digital solutions. To do so, we recommend developing an in-depth understanding of data annotation.

Our research covers the following:

  • What data annotation is
  • Why it matters
  • What its techniques/types are
  • What the key challenges of annotating data are
  • What the best practices for data annotation are

What is data annotation?

Data annotation is the process of labeling data with relevant tags to make it easier for computers to understand and interpret. This data can be in the form of images, text, audio, or video, and data annotators need to label it as accurately as possible. Data annotation can be done manually by a human or automatically using advanced machine learning algorithms and tools.

For supervised machine learning, labeled datasets are crucial because ML models need to understand input patterns to process them and produce accurate results. Supervised ML models (see figure 1) train and learn from correctly annotated data and solve problems such as:

  • Classification: Assigning test data into specific categories. For instance, predicting whether a patient has a disease and assigning their health data to “disease” or “no disease” categories is a classification problem.
  • Regression: Establishing a relationship between dependent and independent variables. Estimating the relationship between the budget for advertising and the sales of a product is an example of a regression problem.
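As a rough illustration of the two problem types, the sketch below trains on tiny labeled datasets. All numbers, the midpoint-threshold rule, and the function names are hypothetical and chosen only for the example; real projects would normally use an ML library rather than hand-written fitting code.

```python
from statistics import mean

# Hypothetical labeled data: (body temperature in °C, label)
patients = [(36.6, "no disease"), (36.9, "no disease"),
            (38.4, "disease"), (39.1, "disease")]

def train_threshold(examples):
    """Classification: learn a cut-off halfway between the two class means."""
    sick = mean(t for t, label in examples if label == "disease")
    healthy = mean(t for t, label in examples if label == "no disease")
    return (sick + healthy) / 2

def classify(temperature, threshold):
    """Assign a new data point to one of the two categories."""
    return "disease" if temperature >= threshold else "no disease"

# Hypothetical labeled data: (advertising budget, product sales)
campaigns = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

def fit_line(examples):
    """Regression: ordinary least squares for sales = slope * budget + intercept."""
    xs = [x for x, _ in examples]
    ys = [y for _, y in examples]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in examples)
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

threshold = train_threshold(patients)
slope, intercept = fit_line(campaigns)
print(classify(38.0, threshold))      # category predicted for a new patient
print(slope * 5.0 + intercept)        # sales predicted for a budget of 5.0
```

In both cases the model can only be as good as the labels: a mislabeled patient shifts the threshold, and a wrong sales figure shifts the fitted line.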

Figure 1: Supervised Learning Example 1

The image shows a supervised learning example. The training dataset contains all kinds of fruit with different labels; the test set contains only 2 types of fruit.

For example, training machine learning models for self-driving cars involves annotated video data. Individual objects in videos are annotated, which allows machines to predict the movements of objects.

Other terms to describe data annotation include data labeling, data tagging, data classification, or machine learning training data generation.

Why does data annotation matter?

Annotated data is the lifeblood of supervised learning models since the performance and accuracy of such models depend on the quality and quantity of annotated data. Machines cannot see images and videos as we do. Data annotation makes the different data types machine-readable. Annotated data matters because:

  • Machine learning models have a wide variety of critical applications (e.g., healthcare) where erroneous AI/ML models can be dangerous
  • Finding high-quality annotated data is one of the primary challenges of building accurate machine-learning models

Gathering data is a prerequisite for annotation. To help you obtain the right datasets, here is some research:

  • Top data crowdsourcing platforms on the market
  • Guide to AI data collection.
  • Data-driven list of data collection/harvesting services.

What are the different types of data annotation?

Different data annotation techniques can be used depending on the machine learning application. Some of the most common types are:

Reinforcement learning from human feedback (RLHF) was introduced in 2017. 2 It grew significantly in popularity in 2022 after the success of large language models (LLMs) like ChatGPT, which leveraged the technique. There are two main types of annotation for RLHF:

  • Humans generating suitable responses to train LLMs
  • Humans annotating (i.e. selecting) better responses among multiple LLM responses.

Human labor is expensive, so AI companies are also leveraging reinforcement learning from AI feedback (RLAIF) to scale their annotations cost-effectively in cases where AI models are confident about their feedback. 3
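A minimal sketch of how preference annotations and confidence-based routing could be represented is shown below. The `PreferenceExample` record, the `ai_confidence` score, and the 0.9 cut-off are assumptions made for illustration; they do not describe any particular vendor's or lab's pipeline.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PreferenceExample:
    prompt: str
    responses: List[str]                 # candidate LLM responses to compare
    ai_choice: int                       # index preferred by the AI labeler
    ai_confidence: float                 # hypothetical score in [0, 1]
    human_choice: Optional[int] = None   # filled in only after human review

def route(examples, min_confidence=0.9):
    """Split examples into AI-labeled ones and ones that need a human annotator."""
    auto, needs_human = [], []
    for ex in examples:
        (auto if ex.ai_confidence >= min_confidence else needs_human).append(ex)
    return auto, needs_human

def final_choice(ex):
    """A human label wins when present; otherwise fall back to the AI label."""
    return ex.human_choice if ex.human_choice is not None else ex.ai_choice

batch = [
    PreferenceExample("Summarize the memo.", ["Short summary", "Off-topic reply"], 0, 0.97),
    PreferenceExample("Is this joke offensive?", ["Yes", "No"], 1, 0.55),
]
auto, needs_human = route(batch)
needs_human[0].human_choice = 0          # a human annotator overrides the AI label
print([final_choice(ex) for ex in batch])
```

The design choice here is that human effort is spent only on the low-confidence item, while the confident AI label is accepted as is.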

2. Text annotation

Text annotation trains machines to better understand text. For example, chatbots can identify users’ requests from the keywords taught to the machine and offer solutions. If annotations are inaccurate, the machine is unlikely to provide a useful solution, so better text annotations lead to a better customer experience. During text annotation, labels are assigned to specific keywords, sentences, and other data points. Comprehensive text annotations are crucial for accurate machine training. Some types of text annotation are:

2.1. Semantic annotation

Semantic annotation (see figure 2) is the process of tagging text documents. By tagging documents with relevant concepts, semantic annotation makes unstructured content easier to find. Computers can interpret and read the relationship between a specific part of metadata and a resource described by semantic annotation.

Figure 2: Semantic Annotation Example 4

The image shows an example of tagged words in a text document.

2.2. Intent annotation

Intent annotation analyzes the needs behind a text and assigns it to a category, such as a request or an approval. For example, the sentence “I want to chat with David” indicates a request.

2.3. Sentiment annotation

Sentiment annotation (see Figure 3) tags the emotions within the text and helps machines recognize human emotions through words. Machine learning models are trained with sentiment annotation data to find the true emotions within the text. For example, by reading the comments left by customers about the products, ML models understand the attitude and emotion behind the text and then make the relevant labeling such as positive, negative, or neutral.

Figure 3: Sentiment Annotation Example 5

The image shows the process of labeling texts in documents
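To make this concrete, the sketch below stores sentiment annotations as (text, label) pairs, checks them against an agreed label set, and counts the labels. The comments and helper names are invented for illustration only.

```python
from collections import Counter

LABELS = {"positive", "negative", "neutral"}

# Hypothetical annotated customer comments: (text, sentiment label)
annotated_comments = [
    ("The battery lasts all day, love it.", "positive"),
    ("Stopped working after a week.", "negative"),
    ("It arrived on Tuesday.", "neutral"),
    ("Great value for the price.", "positive"),
]

def validate(records):
    """Reject records whose label is outside the agreed label set."""
    bad = [text for text, label in records if label not in LABELS]
    if bad:
        raise ValueError(f"unknown labels for: {bad}")
    return records

def label_distribution(records):
    """Count how often each sentiment label occurs."""
    return Counter(label for _, label in records)

distribution = label_distribution(validate(annotated_comments))
print(distribution)
```

Checking the label distribution before training is a cheap way to spot typos in labels and heavily imbalanced classes.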

3. Text categorization

Text categorization assigns categories to sentences, paragraphs, or whole documents according to their subject. This helps users easily find the information they are looking for on a website.

4. Image annotation

Image annotation is the process of labeling images (see figure 4) to train an AI or ML model. For example, a machine learning model trained on tagged digital images can interpret the images it sees with a level of comprehension closer to a human's. With image annotation, objects in any image are labeled, and the number of labels per image depends on the use case. There are four fundamental types of image annotation:

4.1. Image classification

In image classification, a machine trained on annotated images determines what a new image represents, choosing from the predefined labels.

4.2. Object recognition/detection

Object recognition/detection extends image classification. It identifies the number and exact positions of entities in the image. While image classification assigns one label to the entire image, object recognition labels each entity separately. For example, image classification labels an image as day or night, whereas object recognition individually tags various entities in the image, such as a bicycle, tree, or table.
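The difference between the two annotation types can be sketched as a data structure: one label for the whole image versus one labeled box per entity. The class names, pixel coordinates, and file name below are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

@dataclass
class BoundingBox:
    label: str
    x: int        # left edge in pixels
    y: int        # top edge in pixels
    width: int
    height: int

@dataclass
class AnnotatedImage:
    filename: str
    image_label: str             # image classification: one label per image
    objects: List[BoundingBox]   # object detection: one entry per entity

street = AnnotatedImage(
    filename="street_001.jpg",   # hypothetical file name
    image_label="day",
    objects=[
        BoundingBox("bicycle", 40, 210, 120, 80),
        BoundingBox("tree", 300, 20, 90, 260),
        BoundingBox("tree", 420, 35, 85, 240),
    ],
)

def count_objects(image):
    """Number of annotated entities per class."""
    return Counter(box.label for box in image.objects)

print(street.image_label)
print(count_objects(street))
```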

4.3. Segmentation

Segmentation is a more advanced form of image annotation. To make the image easier to analyze, it divides the image into multiple segments, called image objects. There are three types of image segmentation:

  • Semantic segmentation: Every pixel is labeled with a class, so similar objects share one label and individual objects of the same class are not told apart.
  • Instance segmentation: Each individual entity in the image is labeled separately, which captures properties such as the position and number of entities.
  • Panoptic segmentation: Semantic and instance segmentation are combined, so every pixel has both a class label and an instance identity.
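The three types can be sketched with small pixel grids. In the example below the semantic mask stores one class id per pixel, the instance mask tells individual objects apart, and the panoptic mask pairs the two. The 3x6 grid and the id values are made up for illustration.

```python
# Class ids (hypothetical): 0 = background, 1 = car
semantic_mask = [
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

# Instance ids: 0 = no object, 1 = first car, 2 = second car
instance_mask = [
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]

def panoptic(semantic, instance):
    """Pair each pixel's class id with its instance id."""
    return [list(zip(sem_row, inst_row))
            for sem_row, inst_row in zip(semantic, instance)]

def count_instances(instance):
    """Number of distinct objects, ignoring the 0 'no object' id."""
    return len({pixel for row in instance for pixel in row} - {0})

panoptic_mask = panoptic(semantic_mask, instance_mask)
print(panoptic_mask[0][4])             # (class id, instance id) of one pixel
print(count_instances(instance_mask))
```

Note that the semantic mask alone cannot say how many cars there are; that information only appears once instance ids are annotated.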

Figure 4: Image annotation example 6

An image showing the different types of image annotation, including classification, semantic segmentation, object detection, and instance segmentation.

5. Video annotation

Video annotation is the process of teaching computers to recognize objects from videos. Image and video annotation are data annotation methods used to train computer vision (CV) systems; computer vision is a subfield of artificial intelligence (AI).

6. Audio annotation

Audio annotation is a type of data annotation that involves classifying components in audio data. Like all other types of annotation (such as image and text annotation), audio annotation requires manual labeling and specialized software. Solutions based on natural language processing (NLP) rely on audio annotation, and as their market grows (projected to grow 14 times between 2017 and 2025), the demand and importance of quality audio annotation will grow as well.

Audio annotation can be done through software that allows data annotators to label audio data with relevant words or phrases. For example, they may be asked to label a sound of a person coughing as “cough.”
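One plausible way to store such labels is as time-stamped segments, sketched below. The `AudioSegment` record, its field names, and the timings are assumptions for illustration and are not tied to any specific annotation tool.

```python
from dataclasses import dataclass

@dataclass
class AudioSegment:
    start_s: float   # segment start, seconds from the beginning of the clip
    end_s: float     # segment end, seconds
    label: str

# Hypothetical annotations for one recording
segments = [
    AudioSegment(0.0, 2.5, "speech"),
    AudioSegment(2.5, 3.1, "cough"),
    AudioSegment(3.1, 7.0, "speech"),
]

def total_duration(items, label):
    """Total seconds annotated with the given label."""
    return sum(s.end_s - s.start_s for s in items if s.label == label)

def has_overlap(items):
    """True if any two consecutive segments (sorted by start time) overlap."""
    ordered = sorted(items, key=lambda s: s.start_s)
    return any(a.end_s > b.start_s for a, b in zip(ordered, ordered[1:]))

print(total_duration(segments, "cough"))
print(has_overlap(segments))
```

A simple overlap check like this can catch annotation mistakes when each moment of the clip is meant to carry exactly one label.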

Audio annotation can be: 

  • In-house, i.e., completed by the company’s own employees.
  • Outsourced, i.e., done by a third-party company.
  • Crowdsourced, i.e., labeled by a large network of data annotators through an online platform.

7. Industry-specific data annotation

Each industry uses data annotation differently. Some industries use one type of annotation, and others use a combination to annotate their data. This section highlights some of the industry-specific types of data annotation.

  • Medical data annotation: Medical data annotation is used to annotate data such as medical images (e.g., MRI scans), EMRs, and clinical notes. This type of data annotation helps develop computer vision-enabled systems for disease diagnosis and automated medical data analysis.
  • Retail data annotation: Retail data annotation is used to annotate retail data such as product images, customer data, and sentiment data. This type of annotation helps create and train accurate AI/ML models for tasks such as determining customer sentiment and making product recommendations.
  • Finance data annotation: Finance data annotation is used to annotate data such as financial documents and transactional data. This type of annotation helps develop AI/ML systems, such as systems that detect fraud and compliance issues.
  • Automotive data annotation: This industry-specific annotation is used to annotate data from autonomous vehicles, such as data from cameras and lidar sensors. This annotation type helps develop models that can detect objects in the environment and other data points for autonomous vehicle systems.
  • Industrial data annotation: Industrial data annotation is used to annotate data from industrial applications, such as manufacturing images, maintenance data, safety data, quality control, etc. This type of data annotation helps create models that can detect anomalies in production processes and ensure worker safety.

What is the difference between data annotation and data labeling?

Data annotation and data labeling mean the same thing. You will come across articles that try to explain them in different ways and make up a difference. For example, some sources claim that data labeling is a subset of data annotation where data elements are assigned labels according to predefined rules or criteria. However, based on our discussions with vendors in this space and with data annotation users, we do not see major differences between these concepts.

What are the main challenges of data annotation?

  • Cost of annotating data: Data annotation can be done either manually or automatically. However, manually annotating data requires a lot of effort, and you also need to maintain the quality of the data.
  • Accuracy of annotation: Human errors can lead to poor data quality, and these have a direct impact on the prediction of AI/ML models. Gartner’s study highlights that poor data quality costs companies 15% of their revenue.

What are the best practices for data annotation?

  • Start with the correct data structure: Focus on creating data labels that are specific enough to be useful but still general enough to capture all possible variations in data sets.
  • Prepare detailed and easy-to-read instructions: Develop data annotation guidelines and best practices to ensure data consistency and accuracy across different data annotators.
  • Optimize the amount of annotation work: Annotation is costly, so cheaper alternatives should be examined. For example, you can work with a data collection service that offers pre-labeled datasets.
  • Collect data if necessary: If you don’t annotate enough data for machine learning models, their quality can suffer. You can work with data collection companies to collect more data.
  • Leverage outsourcing or crowdsourcing if data annotation requirements become too large and time-consuming for internal resources.
  • Support humans with machines: Use a combination of machine learning algorithms (data annotation software) with a human-in-the-loop approach to help humans focus on the hardest cases and increase the diversity of the training data set. Labeling data that the machine learning model can correctly process has limited value. 
  • Regularly test your data annotations for quality assurance purposes.
  • Have multiple data annotators review each other’s work for accuracy and consistency in labeling datasets.
  • Stay compliant: Carefully consider privacy and ethical issues when annotating sensitive data sets, such as images containing people or health records. Lack of compliance with local rules can damage your company’s reputation.
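The quality-assurance practices above (regular testing and having annotators review each other's work) can be sketched as a simple agreement check. The example assumes three hypothetical annotators labeling the same five items; majority voting and a plain agreement rate are used here as stand-ins, since the article does not prescribe a specific metric.

```python
from collections import Counter

# Hypothetical labels from three annotators for the same five items
labels_by_annotator = {
    "ann_a": ["cat", "dog", "dog", "cat", "bird"],
    "ann_b": ["cat", "dog", "cat", "cat", "bird"],
    "ann_c": ["cat", "dog", "dog", "dog", "bird"],
}

def per_item_votes(labels):
    """Regroup labels so each entry holds all annotators' votes for one item."""
    return list(zip(*labels.values()))

def majority_label(votes):
    """Most common label for one item (ties resolve to the first label seen)."""
    return Counter(votes).most_common(1)[0][0]

def needs_review(votes):
    """Flag items on which the annotators did not fully agree."""
    return len(set(votes)) > 1

votes = per_item_votes(labels_by_annotator)
consensus = [majority_label(v) for v in votes]
flagged = [i for i, v in enumerate(votes) if needs_review(v)]
agreement_rate = 1 - len(flagged) / len(votes)

print(consensus)
print(flagged)
print(agreement_rate)
```

Items in `flagged` are natural candidates for a second look, and a falling agreement rate can signal that the annotation instructions need to be clarified.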

By following these data annotation best practices, you can ensure that your data sets are accurately labeled and accessible to data scientists and fuel your data-hungry projects.

External links

  • 1. Diego Calvo. (2019). Supervised learning. Diego Calvo. Accessed: 29/September/2023.
  • 2. Christiano P.; Leike J.; Brown T.B.; Martic M.; Legg S.; Amodei D. (2017). “Deep reinforcement learning from human preferences”.
  • 3. Bai Y.; et al. (2022). “Constitutional AI: Harmlessness from AI Feedback”. Retrieved January 1, 2024.
  • 4. Articles Hubspot. (2019). What Is Text Annotation in Machine Learning, Examples and How it’s Done? Accessed: 29/September/2023.
  • 5. Sentiment Annotation – Quick Start Guide. Accessed: 29/September/2023.
  • 6. Ashely John. (2020). Why Data & Data Annotation Make or Break AI. Medium. Accessed: 29/September/2023.

Teaching Student Annotation: Constructing Meaning Through Connections

Students learn about the purposes and techniques of annotation by examining text closely and critically. They study sample annotations and identify the purposes annotation can serve. Students then practice annotation through a careful reading of a story excerpt, using specific guidelines and writing as many annotations as possible. Students then work in pairs to peer review their annotations, practice using footnotes and PowerPoint to present annotations, and reflect on how creating annotations can change a reader's perspective through personal connection with text.

Featured Resources

  • Making Annotations: A User's Guide : Use this resource guide to help students make connections with text through definition, analysis of author purpose, paraphrasing, personal identification, explaining historical context, and more.

From Theory to Practice

In his English Journal article "I'll Have Mine Annotated, Please: Helping Students Make Connections with Text," Matthew D. Brown expresses a basic truth in English Language Arts instruction: "Reading is one thing, but getting something of value from what we read is another" (73). Brown uses the avenue of personal connection to facilitate the valuable outcomes that can result from reading and interacting with text. He begins with student-centered questions such as, "What were they thinking about as they read? What connections were they making? What questions did they have, and could they find answers to those questions?" (73). Brown's questions lead to providing students with instruction and opportunities that align with the NCTE Principles of Adolescent Literacy Reform: A Policy Research Brief by "link[ing] their personal experiences and their texts, making connections between the students' existing literacy resources and the ones necessary for various disciplines" (5).

Common Core Standards

This resource has been aligned to the Common Core State Standards for states in which they have been adopted. If a state does not appear in the drop-down, CCSS alignments are forthcoming.

State Standards

This lesson has been aligned to standards in the following states. If a state does not appear in the drop-down, standard alignments are not currently available for that state.

NCTE/IRA National Standards for the English Language Arts

  • 1. Students read a wide range of print and nonprint texts to build an understanding of texts, of themselves, and of the cultures of the United States and the world; to acquire new information; to respond to the needs and demands of society and the workplace; and for personal fulfillment. Among these texts are fiction and nonfiction, classic and contemporary works.
  • 2. Students read a wide range of literature from many periods in many genres to build an understanding of the many dimensions (e.g., philosophical, ethical, aesthetic) of human experience.
  • 3. Students apply a wide range of strategies to comprehend, interpret, evaluate, and appreciate texts. They draw on their prior experience, their interactions with other readers and writers, their knowledge of word meaning and of other texts, their word identification strategies, and their understanding of textual features (e.g., sound-letter correspondence, sentence structure, context, graphics).
  • 4. Students adjust their use of spoken, written, and visual language (e.g., conventions, style, vocabulary) to communicate effectively with a variety of audiences and for different purposes.
  • 5. Students employ a wide range of strategies as they write and use different writing process elements appropriately to communicate with different audiences for a variety of purposes.
  • 6. Students apply knowledge of language structure, language conventions (e.g., spelling and punctuation), media techniques, figurative language, and genre to create, critique, and discuss print and nonprint texts.
  • 7. Students conduct research on issues and interests by generating ideas and questions, and by posing problems. They gather, evaluate, and synthesize data from a variety of sources (e.g., print and nonprint texts, artifacts, people) to communicate their discoveries in ways that suit their purpose and audience.

Materials and Technology

  • Copies of "Eleven" by Sandra Cisneros or other text appropriate for the activities in this lesson
  • Colored Pencils
  • Sample Annotation PowerPoint on The Pearl
  • Making Annotations: A User's Guide or one students create after discussion
  • Annotation Sheet
  • Student Sample Annotations from "Eleven"
  • Annotation Peer Review Guide
  • Example Student Brainstorming for Annotation
  • Sample Revised and Published Annotations Using Footnotes

Preparation

  • Find sample annotated texts to share with your students. Shakespeare's plays work well since many of his texts are annotated.  Red Reader editions published by Discovery Teacher have great user-friendly annotations geared toward young adult readers.  Look for selections that are engaging—ones that offer more than vocabulary definitions and give a variety of annotations beyond explanation and analysis.
  • Alternatively, search Google Books for any text with annotations.  A search for Romeo and Juliet, for example, will bring up numerous versions that can be viewed directly online.
  • While much of the work will be done by students, it is useful to take some time to think about the role of annotations in a text.  You will have students identify the functions of annotations, but it is always helpful if you have your own list of uses of annotations so that you can help guide students in this area of instruction if necessary.
  • Make copies of all necessary handouts.
  • Arrange for students to have access to Internet-connected computers if they will be doing their annotations in an online interactive.
  • Test the Literary Graffiti and Webbing Tool interactives on your computers to familiarize yourself with the tools and ensure that you have the Flash plug-in installed. You can download the plug-in from the technical support page.

Student Objectives

Students will:

  • examine and analyze text closely, critically, and carefully.
  • make personal, meaningful connections with text.
  • clearly communicate their ideas about a piece of text through writing, revision, and publication.

Session One

  • Begin the session by asking students if they are familiar with the word annotation. Point out the words note and notation as clues to the word's meaning. If students know the word, proceed with the next step. If students are unfamiliar, ask them to determine what the word means by seeing what the texts you pass out in the next step have in common.
  • Pass out a variety of sample texts that use annotations. If you are using Google Books, direct students to texts online to have them examine the annotations that are used.
  • Have the students skim the texts and carefully examine the annotations.  Encourage students to begin to see the variety of ways that an editor of a text uses annotations.
  • Working with a small group of their peers, students should create a list that shows what effective annotations might do.
  • give definitions to difficult and unfamiliar words.
  • give background information, especially explaining customs, traditions, and ways of living that may be unfamiliar to the reader.
  • help explain what is going on in the text.
  • make connections to other texts.
  • point out the use of literary techniques and how they add meaning to the text.
  • use humor (or other styles that might be quite different from the main text).
  • reveal that the writer of these annotations knows his or her reader well.
  • The process of generating this list should move into a discussion about where these annotations came from—who wrote them and why.  Guide students to think about the person who wrote these ideas, who looked at the text and did more than just read it, and who made a connection with the text.  It is important here that students begin to realize that their understanding of what they have read comes from their interaction with what is on the page.  You may wish to jumpstart the conversation by telling students about connections you make with watching films, as students may be more aware of doing so themselves.
  • touch them emotionally, making them feel happiness as well as sadness.
  • remind them of childhood experiences.
  • teach them something new.
  • change their perspective on an issue.
  • help them see how they can better relate to others around them.
  • help them see the world through someone else's experiences.
  • Before beginning the next lesson, create your Annotation Guide reflecting the different functions of annotation the class discussed today (or use the Sample Annotation Guide).

Session Two

  • Pass out "Eleven" by Sandra Cisneros or any other text appropriate for your students and this activity.
  • Read and discuss the story as needed, but resist spending too much time with the story since the goal of annotation is to get the students to connect with the text in their own ways.
  • Pass out the Sample Annotation Guide or the one the class created and review the various ideas that were generated during the previous session, helping students to begin to think of the various ways that they can begin to connect to the story "Eleven."
  • Pass out the Annotation Sheet and ask the students to choose a particularly memorable section of the story, a section large enough to fill up the lines given to them on the Annotation Sheet.  (NOTE: While you could have the students create annotations in the margins of the entire text, isolating a small portion of the text will make the students' first attempt at annotations less daunting and more manageable. You can also use the ReadWriteThink interactives Literary Graffiti or Webbing Tool at this point in the instructional process, replacing or supplementing the Annotation Sheet handout.)
  • Share with students the Student Sample Annotations from "Eleven" and use the opportunity to review the various purposes of annotating and preview directions for the activity.
  • Pass out the colored pencils.  Make sure that students can each use a variety of colors in their annotating.  Sharing pencils among members of a small group works best.
  • Have the students find a word, phrase, or sentence on their Annotation Sheet that is meaningful or significant to them.  Have them lightly color over that word, phrase, or sentence with one of their colored pencils.
  • Students should then draw a line out toward the margin from what they just highlighted on their Annotation Sheet.
  • Now students annotate their selected text.  Using the Sample Annotation Guide, students should write an annotation for the highlighted text.  They can talk about how they feel or discuss what images come to mind or share experiences that they have had.  Any connection with that part of the text should be encouraged at this entry-level stage.
  • Repeat this process several times.  Encourage students to use a variety of annotations from the Sample Annotation Guide.  But, most importantly, encourage them to make as many annotations as possible.
  • What did they get out of writing annotations?
  • What did they learn about the text that they didn't see before?
  • How might this make them better readers?
  • Students should take the time to share these reflections with each other and with the whole class. Collect responses to evaluate levels of engagement and to find any questions or concerns you may need to address.

Session Three

  • Return annotations from the previous session and address any questions or concerns.
  • Explain that, working in pairs, the students will examine each other's annotations and look for ideas that have the potential for further development and revision. 
  • Distribute copies of the Annotation Peer Review Guide and explain how it will help them work together to select the best ideas that they have presented in their annotations. Peer review partners should label each annotation, comment on it, and look for several annotations that would benefit from revision and continued thinking.
  • Have each pair narrow down their ideas to the four or five most significant annotations per student.
  • Once this is done, give the students time to start revising and developing their ideas.  Encourage them to elaborate on their ideas by explaining connections more fully, doing basic research to answer questions or find necessary information, or providing whatever other development would be appropriate.
  • Circulate around the room to look at what the students have chosen so that you can guide them with their development and writing.  If you see the need to offer more guiding feedback, collecting the annotation revisions during this process may be helpful.

Session Four

  • Once students have revised and developed a few of their annotations on their own, students should begin work toward a final draft.
  • The students exchange their revised annotations.
  • What is one thing that I really liked in this set of annotations?
  • What is one thing that I found confusing, needed more explanation, etc.?
  • If this were my set of annotations, what is one thing that I would change?
  • Encourage students to rely heavily on the Sample Annotation Guide and the Annotation Peer Review Guide to make these comments during the peer review process. They should be looking to see that there are a variety of annotations and that the annotations dig deeper than just surface comments (e.g., definitions) and move toward meaningful personal connections and even literary analysis.
  • Take the original format of the annotation sheet and have the students type up their work using colored text.
  • Teach the students how to footnote, and then have them use this footnoting technique for the final draft of their annotations. See the Sample Student Brainstorming for Annotation and Sample Revised and Published Annotations Using Footnotes on The Great Gatsby. If using Microsoft Word, visit the resource Insert a Footnote or Endnote for information on how to use this feature in Word.
  • Create a PowerPoint in which the first slide is the original text. The phrases are then highlighted in different colors and hyperlinked to other slides in the presentation which contain the annotations. See the Sample Annotation PowerPoint on The Pearl, and visit PowerPoint in the Classroom for tutorials on how to make the best use of PowerPoint functions.
  • What did they learn by doing this activity?
  • How did these annotations change their perspective on the text?
  • In what ways did their thinking change as they worked through the drafting, rewriting, and revising of their annotations?
  • Make sure that students are given time to share these reflections with each other and with the whole class.
  • annotate a whole text, using the margins for annotating
  • use sticky notes in textbooks or novels as a way to annotate larger works
  • use annotations as part of a formal essay to provide personal comments to supplement the analysis they have written.
  • Assessing Cultural Relevance: Exploring Personal Connections to a Text
  • Graffiti Wall: Discussing and Responding to Literature Using Graphics
  • In Literature, Interpretation Is the Thing
  • Literary Scrapbooks Online: An Electronic Reader-Response Project
  • Reader Response in Hypertext: Making Personal Connections to Literature
  • Creative Outlining—From Freewriting to Formalizing

Student Assessment / Reflections

  • Review and comment on student reflections after each step of the annotation drafting and revision process.
  • If you use this lesson as an introduction to the idea of annotation, the focus of the assessment should be on the variety of annotations a student makes.  Even so, teachers should be able to observe if students were able to move beyond surface connections (defining words, summarizing the story, and so forth) to deeper connections with the text (personal feelings, relating events to past experiences, and so forth).  Use an adaptation of the Annotation Peer Review Guide in this process.
  • For those who take this lesson to its completion by having students generate a final published draft, the focus should move from just looking for a variety of annotations to focusing on the quality of the annotations.  By working through the writing process with these annotations, students should have been able to comment meaningfully beyond what they began with in their “rough draft.”  This should be most evident in the reflections students write in response to the process of creating annotations. Again, a modified version of the Annotation Peer Review Guide would be suitable for this evaluative purpose.

The Webbing Tool provides a free-form graphic organizer for activities that ask students to pursue hypertextual thinking and writing.


Why & How To Annotate A Book

Last Updated on August 18, 2023 by Louisa

I remember getting scolded in school for writing in my books, but I’ve since learned that it’s actually good practice.

Of course, you need to know how to annotate a book the right way; you can’t just doodle in the margins (which, admittedly, is what I did).

As a teacher, I always encourage my students to write notes in the margins or to highlight texts in their books. The reason for this is that it helps to build a connection to the book, increases understanding, and also helps students remember what they have read.

But it’s not just students who should annotate their books. It’s good practice for any reader to annotate their books for fun to personalize their reading experience.

But if you don’t know why you should start annotating your books or how to do it, here are some tips on how to do it effectively…

Why Annotate Your Books?

Before I tell you my top tips for how to annotate your books for fun, let’s quickly think about why you should annotate your books.

This may be my teacher’s brain talking, but I feel this practice can be applied to any reader, no matter your age or reason for reading.

1. It enhances your comprehension

The main reason why you would annotate a book is to gain a better understanding of the story.

By adding your thoughts and insights to the margin of the text, you’re engaging with the material on a deeper level.

You’ll be able to make connections, identify themes, and understand complex ideas better. You can also write down questions that arise, which may be answered later on.

2. It personalizes your reading experience

Annotating your books is like having a conversation with the author. You’re sharing your reactions, asking questions, and offering feedback on the material.

It personalizes the reading experience and makes it more meaningful.

3. It helps you to retain more information

Annotating your books can help you retain information better. When you write down key points, ideas, and themes, it reinforces the information in your brain, making it easier to recall later.

Tips for Effective Book Annotation

Now you know why you should annotate your books, here are my tips for how to annotate your books for fun.

1. Use a pencil

Now I really do sound like a teacher talking!

It’s always a good idea to use a pencil instead of a pen so that you can erase your annotations later or adjust them while you’re reading.

As I mentioned earlier, you can write down questions that come up when you’re reading, and then they might be answered later. You can go back and erase the questions once they have been answered, or write “answer on page 10” for example.

2. Understand when to start annotating

The first thing to remember when annotating your book is that you should not be scribbling all over the page and you should only annotate when you need to.

One of the purposes of annotating is to help build memory and to help with study.

If you need to refer back to anything in your book and you’ve written a little poem around the answer, well you’re going to find it hard to make sense of your annotations.

You should only write when you feel it’s absolutely necessary. When I’m annotating a book, I try to write only what will be useful to me later on.

3. Highlight or underline what’s important

Sometimes you don’t need to write sentences or paragraphs in the margin, you can simply underline key quotes or messages that you think are important.

You might want to annotate next to what you have underlined to explain why you have highlighted it.

4. Know what to write about

In your annotations, you should write about key themes and ideas, and underline quotes that you find interesting or essential to the material.

You should write your reflections on the text, what you think it means and what you think is being said between the lines.

5. Use symbols and abbreviations

Develop a set of symbols and abbreviations to help you mark the text quickly and easily.

For example, you could use an asterisk (*) to indicate where a paragraph corresponds to a section you have underlined.

You could also draw a star symbol next to significant points or underline phrases that resonate with you.

6. Keep it organized

Use different colored highlighters, sticky notes, or tabs to organize your annotations.

For example, you could use pink highlights for quotes, yellow for important themes, blue for questions, and green for your personal reactions.

You can also dog-ear pages (folding the corner) where you think there is something important. I personally don’t like doing this as I like to keep my books tidy, but you can use bookmarks or sticky tabs instead.

If you don’t want to write directly on the page, you can also use a post-it note.

Examples of Annotations

If you’re ready to start annotating but not sure where to start, here are some really cool examples of people who have annotated their books in different ways.

Example from Instagram: a post shared by ✧ lisa ✧ | bookstagram (@buryme.withmybooks)

This person has used different colored sticky tabs to identify different themes in the book. This is a nice way to get a visual of the main themes in a book, especially if you need to do a book review or write a book report.

Example from Instagram: a post shared by lili | bookstagram (@lilisread)

This person has highlighted a quote they love and drawn hearts around it to signal that they like this quote. The use of doodling helps to add color to the book and make it pleasing to look at, but I personally think it needs to be done sparingly as it can be distracting from the text, which is not the goal.

Example from Instagram: a post shared by bella ♡ (@belladaneer)

This person has written their thoughts after highlighting a sentence that resonates with them. They also have a system of when to highlight something and when to underline it. It appears they have highlighted dialogue and underlined important quotes that resonate with them.

Example from Instagram: a post shared by @hopescure

This is a classic example of how to annotate a book for study. The reader has circled emotive words and underlined quotes that resonate with them. This is a good way to understand the emotion of the characters.

FAQs About How To Annotate Books

Can you annotate on Kindle?

The Kindle Scribe, a Kindle model first released in 2022, allows readers to annotate books by hand. It comes with a digital pen for writing notes. Other Kindle models let you highlight text and add typed notes, but not handwritten annotations.

Can you annotate on Apple Books?

Yes, you can highlight and add notes on books using Apple Books.

Annotating your books is an engaging and effective way to enhance your reading experience.

By adding your thoughts to the margin of the text, you’ll develop a more profound and meaningful understanding of the material.

So, grab a pencil and start annotating.

About Louisa Smith

Editor/Founder - Epic Book Society

Louisa is the Founder, Editor, and Head Honcho of Epic Book Society. She was born and raised in the United Kingdom and graduated from the University for the Creative Arts with a degree in Journalism. Louisa began her writing career at the age of 7 when her poetry was published in an anthology of poems to celebrate the Queen's Jubilee. Upon graduating university, she spent several years working as a journalist writing about books before transitioning to become a Primary School Teacher. Louisa loves all genres of books, but her favorites are Sci-Fi, Romance, Fantasy, and Young Adult Fiction.

Sadlier's English Language Arts Blog

September 21, 2023 ELA PD - Literacy , ELA K-5 , ELA 6-8 , ELA Focus - Close Reading , ELA Resources - Tip Sheets , Core Literacy

Annotating Text Strategies That Enhance Close Reading [Free Printable]

By: Erin Lynch

One of the most important skills I teach my students as we begin to work on close reading is how to annotate texts. Teaching annotation strategies will help students keep track of key ideas while reading. In this article, you'll discover annotating strategies that will enhance close reading and free printable resources you can use in the classroom!

Annotating Text Strategies

Annotating a text is when the reader “marks up” a text to indicate places of importance or something they don’t understand. Sometimes students annotate by circling a word, underlining a phrase or highlighting a sentence. Annotating also includes writing notes in the margin; these notes might be thoughts or questions about the text. This process of annotating helps the reader keep track of ideas and questions and supports deeper understanding of the text.

Teaching annotation strategies will help students keep track of key ideas, and will help them formulate thoughts and questions they have while reading.

Benefits of Annotating a Text

The benefits of annotation include:

  • Keeping track of key ideas and questions
  • Helping formulate thoughts and questions for deeper understanding
  • Fostering analyzing and interpreting texts
  • Encouraging the reader to make inferences and draw conclusions about the text
  • Allowing the reader to easily refer back to the text without rereading the text in its entirety

Annotating With a Purpose

Students are taught to read with a purpose, and they should also be taught to annotate with a purpose. Teaching students to annotate with a purpose will help them focus on what is most important about the text.

When teaching annotation, I instruct students to use the following symbols:

  • Underline key ideas and major points.
  • Write a ? next to anything that is confusing, such as unfamiliar words or unclear information.
  • Circle key words or phrases.
  • Put an ! next to surprising or important information or information that helps you make a connection.

Printable Annotation Examples and Activities

Model for Annotating a Text, Grades 2–5

Download my Model for Annotating a Text which uses the poem The Spider and the Fly by Mary Howitt. My students have enjoyed using this poem as an introduction to the close reading of poetry and the skill of annotating.

The Model for Annotating a Text download includes an instructional tip sheet and annotation examples for students. You can make individual copies for your students to keep handy, or enlarge the annotation example to a poster size and hang it in the classroom!

Here's how to use the Model for Annotating a Text:

Explain to students that the annotations of skillful readers identify what they don’t understand and point out major facts or ideas they want to remember and use in their discussions and writing. Annotation also encourages readers to make inferences and to draw conclusions about the text, as well as to make interpretations on a deeper level.

Next, review the symbols students should use when annotating a text. Caution students that over-annotating will be confusing rather than helpful.

Then read the poem The Spider and the Fly by Mary Howitt and pause to model how to annotate with your students.

"I Have a Dream" Close Reading Kit, Grades 3–8

My "I Have a Dream" Close Reading Kit also includes resources for teaching close reading annotation! In the kit you'll find an instructional guide for teachers and annotations for the first 10 paragraphs of Dr. Martin Luther King Jr.'s "I Have a Dream" speech. Use this kit to model close reading in your classroom!

Annotating Practice Worksheets Kit, Grades 1–8

Once your students have learned the correct way to annotate a text, have them practice annotating with a purpose! With the Annotating Practice Kit , students will practice their annotation skills while reading the following articles:

  • The First Playground
  • The Dove and the Ant
  • Sea Otters!

Use the worksheets and text excerpts in the Annotating Practice Kit to get students annotating with a purpose.

In Conclusion

Teaching your students how to annotate with a purpose will help them keep track of key ideas, and will help them formulate thoughts and questions they have while reading. It also encourages the reader to make inferences and draw conclusions about the text, as well as make interpretations on a deeper level. Annotating allows the reader to easily refer back to the text without rereading the text in its entirety.

Grab the free downloads today and use them with students as they begin to annotate texts.

How to annotate a sketchbook: a guide for art students

Last Updated on November 29, 2021

High school art students often have to submit sketchbooks, art journals, or other preparatory material that includes writing as well as visual material. This annotation plays an important role in how examiners assess and respond to your work. Although each qualification has its own assessment criteria and requirements, almost all high school art programs have similar standards and expectations when it comes to annotation. This article sets out best practice when it comes to producing outstanding sketchbook annotation, and includes examples from students who achieved excellent results around the world. It is likely to be particularly helpful for students who are wondering how to annotate an A Level Art sketchbook, those wishing to conduct formal analysis for an IB Visual Arts Process Portfolio, or those looking for GCSE Art annotation examples.

Communicate intentions

It is helpful to begin a sketchbook by discussing your intentions, initial ideas, or design brief, including any requirements and restrictions set for the project. (Some students also include brainstorming and mind maps at this stage of their project).

Demonstrate subject-specific knowledge

Aim to communicate your thoughts in an informed, knowledgeable manner, using a range of art-related vocabulary and terminology. This knowledge may be the result of formal classroom lessons, individual research, or personal art-making experience.

Include personal responses

Aim to record personal reflections, evaluations, and judgments, rather than regurgitating facts or the views of others. The aim is to provide insights into your thinking and decision-making processes. Visual art examiners do not want to read long lists of facts, excessively detailed descriptions of technical processes, extensive artist biographies, or long-winded passages documenting broad periods of art history. Use research to inform your own responses. It is not acceptable to copy written information directly from other sources, although small portions may be quoted and referenced.

Avoid the obvious

Self-explanatory statements—such as ‘this is a drawing of a shoe’—are unnecessary. Such comments do not communicate any new information to the examiner.

Communicate with clarity

Write in a succinct and clear manner. A sketchbook should not contain endless pages of waffle; this wastes the examiner’s time as well as your own.  You can record thoughts in any combination of legible formats: mind maps, questions, bulleted summaries, or complete sentences and paragraphs. Whichever format you choose, avoid ‘txt’ language and ensure that you proofread for spelling errors. These indicate carelessness and may suggest that the work belongs to a low-caliber student.

“Don’t feel you have to write in full sentences. Noting key words or phrases can be just as effective.” (Annotating your work, GCSE, Art & Design, BBC Bitesize Guides)

Reference all images, text, and ideas from other sources

All content from other sources should be formally acknowledged and credited. This is true even when you are interpreting the content rather than directly copying it. It is helpful to cite the artist underneath the relevant image (artist name, artwork title, media, date, and image source). Also, provide brief details about any visits to studios, galleries, or museums, noting that you visited in person. Label any original photographs so that it is clear to the examiner which images are your own.

Critically analyze artwork

Art analysis is an integral component of most high school art programs. Make sure you also analyze your own artwork, appraising the outcomes against your original intentions and the assessment objectives. These insights should inform and influence subsequent work.

For further assistance with sketchbook annotation, please read our guide to analyzing artwork. This is a comprehensive art annotation help sheet, with art annotation vocabulary formulated into questions to help guide students through how to annotate an artwork.

Need more help with creating a sketchbook?

This article is part of a series we have published about high school sketchbooks. You may also be interested in viewing our other sketchbook resources:

  • Painting / fine art sketchbooks
  • Photography sketchbooks
  • Graphic design sketchbooks
  • Textile and fashion design sketchbooks
  • Sculpture, architecture, and 3D Design sketchbooks
  • Digital sketchbooks
  • Tips for producing an amazing high school sketchbook (this was originally written for A Level Art and IGCSE/GCSE Art students, but is relevant for students creating a sketchbook, art journal, or visual diary as part of any high school art qualification)

Amiria Gale

Amiria has been an Art & Design teacher and a Curriculum Co-ordinator for seven years, responsible for the course design and assessment of student work in two high-achieving Auckland schools. She has a Bachelor of Architectural Studies, Bachelor of Architecture (First Class Honours) and a Graduate Diploma of Teaching. Amiria is a CIE Accredited Art & Design Coursework Assessor.

Teaching Expertise

17 Awesome Annotation Activities

March 23, 2023 //  by  Laura Spry

By teaching kids annotation skills we can greatly improve their reading comprehension and critical thinking skills. It’s important to first explain what annotation means so that learners understand why they will be working through this process. We’ve sourced 17 awesome annotation activities to get you started. Let’s take a look.

1. Poetry Annotation

To successfully annotate poetry, students must analyze and interpret the different elements of a poem in order to gain a deeper understanding of its literary devices and meaning. This activity teaches students to look for depth and complexity by focusing on four elements: speaker, pattern, shift, and description.

Learn More: Gifted Guru

2. Annotate Texts

This handy guide breaks down the key elements of learning to annotate texts. Start by using the cards that have two stories in the same genre. Dissect these using the prompts. Next, give students two stories that are from different genres and have them discuss the differences.

Learn More: Teaching with a Mountain View

3. Annotation Symbols

Annotation symbols can be used to provide additional information or clarification about a particular text. Have your students pick up to 5 of these symbols to annotate another student’s work. Having them read others’ work is great practice and symbols make great annotation tools!

Learn More: Pinterest

4. Annotate Books

Before you can annotate a book, it’s important to read it actively, meaning engaging with the text, taking notes, and highlighting key points. This is key when teaching students about annotation. Start by asking your students to annotate a page from your class text. They can start by underlining keywords individually and then add more detail during class discussion.

Learn More: The Wordy Habitat

5. Rainbow Annotation

When students learn to use different colored sticky notes, they can easily scan an annotated text for specific information. Here, they have used red for angry emotions, yellow for funny, clever, or happy sections, and green for surprising moments. These can easily be adapted for any text. Work together as a class to make your own color key to ensure a variety of annotations are used!

6. Annotation Bookmarks

Encourage a variety of annotations by handing out these cool annotation bookmarks. Easily kept inside student books, there will no longer be an excuse for forgetting how to annotate! Students can add some color to these bookmarks and match the colors when annotating a text.

Learn More: Ideas by Jivey

7. S-N-O-T-S: Small Notes on the Side

Reminding students not to forget their SNOTS is sure to help them remember to make Small Notes On The Side! Kids are taught to underline key points in green. They can then go back over the text to circle important words, add diagrams, and make notes of what they would like to include in their response.

Learn More: The Applicious Teacher

8. Projector and Whiteboard

By setting your camera above a text and displaying this on your whiteboard, you can show your students how to annotate in real-time. Go through the common steps involved in basic annotation and let them have a go at annotating their own text using the methods you have shown.

Learn More: Teaching Teens in the 21st

9. Label the Turtle

Younger kids will need to be exposed to the labeling process before learning to annotate. This cute sea turtle activity teaches kids the importance of using the correct labels in their written work. The turtle can also be colored in once the written work is complete!

Learn More: Homeschool Preschool

10. Annotate the Flower

Working with real-world materials is a surefire way to get kids engaged with their work! Using a flower, have learners label the different parts.  Additionally, they can complete a drawing of their activity and add labels and extra annotations to each part.

Learn More: Lessons 4 Little Ones

11. Practice Notetaking

Notetaking is a skill that almost everyone will require in their lifetime.  Learning to take good notes is key when learning to annotate texts. Have your students gather on the carpet with their whiteboards. Read a few pages from a non-fiction book and pause for them to write down important things that they’ve learned. 

Learn More: Primarily Speaking

12. Mind Map to Annotate

Here, students start by choosing a central idea and drawing or writing it as a keyword in the center of a piece of paper. Branches are then added for key themes and keywords, with phrases as the sub-branches. Gaps and connections should be filled with more ideas or annotations. This simple process helps students plan their annotations.

Learn More: Chloe Burroughs

13. Create a Color Key

Encourage students to make the correct labels by using a colored key. The descriptions will vary depending on the type of text you are annotating. Here, they have used blue for general plot information and yellow for questions and definitions.

Learn More: Tumblr

14. Annotation Marks

These annotation marks can be placed in the margin of students’ work when annotating to show key points. A question mark symbolizes something the student does not understand, an exclamation mark indicates something surprising, and ‘ex’ is written when the author provides an example.

15. Annotate a Transcript

Provide each student with a transcript of a TED Talk. As they listen, they must annotate the talk with notes or symbols. These will be used to help them write up a review of the talk.

Learn More: TED Talks

16. Annotation Station

This activity requires careful observation and attention to detail. It works best as a small group or individual assignment. It works well as an online method by using breakout rooms in Google Meet or Zoom. Provide your students with an image to annotate. Students can then add details and make observations about the image. If you have touchscreen devices, students can use the pen tool to draw on top of the picture. For non-touch devices, use the sticky note tool to add observations.

Learn More: Chromebook Classroom

17. Annotate a Timeline

This can be adapted to your class book or topic. Discuss an appropriate timeline and set groups of students to provide collaborative annotations for that part of the story or area of history. Each student must provide a key piece of information and a fact to add to the annotated timeline.

Learn More: Mr T does Primary History

Teaching Made Practical

Ideas for teaching upper elementary students (3rd, 4th, and 5th grade students) how to annotate texts in a fun way

Annotating Text in 3rd, 4th, and 5th Grade

Written by Guest Blogger Jessica Thompson

I remember telling my class we were going to annotate the text. Wait. We are going to anno-what?

Very simply put, to annotate is to take notes or add comments. For students in third, fourth, and fifth grade, learning to annotate the text they are reading will be an imperative skill as they get into middle school, high school, and beyond.

It will help them develop proper study skills, connect to and remember the text, as well as learn to track their own thinking. They will comprehend at a deeper level. It’s a skill that is often overlooked: students are expected to know how to do it, but we should model it to guide them towards success.

Beginning to Annotate Text in Upper Elementary

Students can track their thinking by writing in the margins of the text (if it is a consumable) or do their note-taking in a notebook as they read.

As you begin to teach annotations to students be sure to keep it simple. Start with writing down important facts, thoughts, questions, and unknown words. The students can look up the unknown words at a later time.

Make sure upper elementary students understand why they should annotate. They should be able to explain that it is a way for them to better understand the text and develop a deeper understanding of their thinking. It is also a great way for students to monitor their own comprehension. 

Be sure to discuss how students tracked their thinking while reading, so they will get ideas from each other. As you teach new skills in class, apply these to the annotations as well. For instance, if you are teaching main idea, have students identify details that support the main idea. The same goes for theme, character traits, problem and solution, and so on. The opportunities for students to annotate are endless!

Scaffolding for Student Success

Annotating text will seem tedious to 3rd, 4th, and 5th grade students as they first begin, but soon you will see the quality and the amount that they are writing in those margins increase.

Model annotating text first to ease the students’ minds. Then, annotate together before you let them try it on their own. This gradual release will have them feeling confident as they track their thinking.

After the students get in a routine of annotating, give them a passage to annotate with comprehension questions. They will annotate the text as they read (and reread hopefully!) then go to the questions. They will see the value in annotating when they realize the notes they took helped them answer the questions. Some of the notes may even be the answers they needed. When students begin to have aha moments, let them share their insight with the class.

Shake it Up and Make Annotations Fun

  • Use different colored pens. Let the students pick their favorite color, or all the colors,  to do their annotations. Kids love having a choice and this gives them some freedom. Your visual learners will love this!
  • Use Post-its. If you are using a textbook or library book that cannot be written in, then use Post-it notes to stick on the pages. No Post-its? No problem. Let students take notes on construction paper or colored computer paper.
  • Take a grade. This seems silly and so not fun, but when students realize that they can get a good grade annotating text, it really boosts them up. Their annotations should connect to the skills you are working on in class for it to make sense to take a grade. If you are teaching problem and solution, then their notes should identify the problem and events leading to the solution. 
  • Science. Seriously, science period. Annotations are not just for reading class, so find an excerpt, a poem, a story, or an article that goes along with the topic being taught in science class and let the students get to work. 
  • Choose your own text. Have a few options of articles, short stories, or poetry for students to choose from and let them annotate. Students will like choosing a genre and topic that they enjoy. Have options from sports, animals, and any other interesting topics that you know your students will enjoy.

As students get more confident annotating the text, you will begin to notice them doing it on their own. Yay! They will begin to annotate cold reads, homework, and tests and you will jump for joy as a teacher. The biggest challenge is time. Do not make annotations a separate lesson, but instead incorporate annotations into what you are already doing. Use text that you are already using in class. Work smarter not harder and you will see the benefits of annotating text through your students’ comprehension and higher level thinking.

  • Open access
  • Published: 21 February 2024

Quality assessment of gene repertoire annotations with OMArk

  • Yannis Nevers   ORCID: orcid.org/0000-0002-8604-2943 1 , 2 ,
  • Alex Warwick Vesztrocy 1 , 2 ,
  • Victor Rossier 1 , 2 , 3 ,
  • Clément-Marie Train 1 ,
  • Adrian Altenhoff   ORCID: orcid.org/0000-0001-7492-1273 2 , 4 ,
  • Christophe Dessimoz 1 , 2 &
  • Natasha M. Glover 1 , 2  

Nature Biotechnology (2024)

  • Molecular evolution
  • Quality control
  • Sequence annotation

In the era of biodiversity genomics, it is crucial to ensure that annotations of protein-coding gene repertoires are accurate. State-of-the-art tools to assess genome annotations measure the completeness of a gene repertoire but are blind to other errors, such as gene overprediction or contamination. We introduce OMArk, a software package that relies on fast, alignment-free sequence comparisons between a query proteome and precomputed gene families across the tree of life. OMArk assesses not only the completeness but also the consistency of the gene repertoire as a whole relative to closely related species and reports likely contamination events. Analysis of 1,805 UniProt Eukaryotic Reference Proteomes with OMArk demonstrated strong evidence of contamination in 73 proteomes and identified error propagation in avian gene annotation resulting from the use of a fragmented zebra finch proteome as a reference. This study illustrates the importance of comparing and prioritizing proteomes based on their quality measures.

Sequencing many species from diverse taxa will drastically improve comparative genomics methods and our ability to elucidate when and how genes and species evolved 1 , provided the data truly reflect biological reality. This process necessitates rigorous quality control. Robust quality standards for genome assembly have been defined by sequencing initiatives, but improved metrics for genomic features, especially protein-coding genes, are needed 2 . These standards should assess gene repertoire completeness, accuracy of gene models, absence of misannotated non-coding sequences and contamination. A few methods, based on conserved gene markers, can be used to measure the completeness of a gene repertoire (for example BUSCO 3 , 4 , EukCC 5 , DOGMA 6 and CheckM 7 ) and to some extent contamination from other species (for example EukCC and CheckM). Other quality indicators, such as the UniProt Complete Proteome Detector, flag annotations with an unexpected number of protein-coding genes 8 . However, no existing methods estimate the extent of spurious annotation, which is common in publicly available genomes 9 .

We present OMArk, a method for eukaryotic proteome quality assessment. OMArk rapidly places query protein sequences into known gene families and compares them to the expected families of the species’ lineage. OMArk outputs multiple complementary quality statistics for the query proteome (Fig. 1a ). First, it estimates the completeness, based on the proportion of expected conserved ancestral genes present. This is similar to BUSCO but also considers conserved multicopy genes. Second, OMArk estimates the taxonomic consistency (i.e., the proportion of protein sequences placed into known gene families from the same lineage). Sequences placed into gene families from other taxa or not placed at all may be contaminant or erroneous sequences. Thus, OMArk assesses proteome quality by evaluating not only what is expected to be there but also what is not expected to be there—contamination and dubious proteins. This feature, to our knowledge, is not fully provided by any existing methods. We demonstrate OMArk’s accuracy in estimating multiple quality metrics on proteomes with artificially introduced errors and in real-use cases.

figure 1

a , Schematic overview of the OMArk concept and output. OMArk provides two main quality assessment categories: completeness assessment and consistency assessment. Completeness assessment is based on the overlap of the query proteome with a conserved ancestral gene set of the species’ lineage. OMArk classifies genes in the query proteome that are found in a single copy or multiple copies (duplicated) or missing. Completeness assessment is similar to methods like BUSCO but also considers conserved genes that are in multiple copies. Consistency assessment is based on the proportion of query proteins placed in gene families of the correct lineage (consistent), gene families of an incorrect lineage (either randomly (Inconsistent) or to specific species (contamination)) and placed in no gene families at all (unknown). b , An example of OMArk’s graphical output for the model organism zebrafish ( Danio rerio ). The top of the stacked bar plot represents the completeness assessment and shows genes that are found in a single copy (dark green; here: 91.78%), duplicated (light green, 4.85%) or missing (red, 3.37%). The lower part of the bar plot represents the consistency assessment and shows taxonomically consistent genes (blue, 96.31%), taxonomically inconsistent genes (violet, 0.65%), contaminants (orange; none in this example) or genes with no detected homology (unknown; black, 3.04%). All categories annotated with hashes correspond to the proportion of partial mappings (black hashes) and fragmented genes (white hashes).

Software overview.

OMArk is available as an open-source command-line tool and a web server. The command-line tool is distributed as a Python package on Anaconda, PyPI and GitHub ( https://github.com/DessimozLab/OMArk ). In addition to a query proteome, it needs only a precomputed OMAmer database, which is available for download from the OMA browser 10 .

The web server ( https://omark.omabrowser.org ) lets users upload a FASTA file of their proteome of interest and visualize or download the results once the computation is done, typically within 35 min for a proteome of 20,000 sequences. Additionally, users can interactively browse and compare precomputed OMArk results for over 8,000 annotation sets from the National Center for Biotechnology Information (NCBI), Ensembl and UniProt.

Query protein placement

OMArk takes as input a proteome FASTA file in which each gene is represented by at least one protein sequence. OMArk starts with OMAmer 11 , a fast k -mer-based method that assigns proteins to gene families and subfamilies (Fig. 2a ), represented as hierarchical orthologous groups (HOGs) 12 . These gene families are predefined in the OMA database 13 using over 2,500 species but could in principle be used with other databases using the HOG concept.

figure 2

a , Sequences from the query proteome are placed into known HOGs using the k -mer-based fast-mapping method OMAmer. Shown is a gene tree with nested gene families (HOGs), delineated by speciation and duplication events. OMAmer provides accurate placement of protein sequences in their correct subfamily. b , The specific taxon of the query species is automatically determined by OMArk. Here, the species tree is shown, with protein placements represented by red dots. The size of the dots is logarithmically proportional to the number of placements in a typical scenario but simplified for this schema. The path to the query taxon (blue) is inferred based on the maximal number of placements, and the path(s) to contaminant taxa (gold) are determined as those with more placements than expected by chance. c , OMArk defines the ancestral reference lineage for a given query species as the most recent taxonomic level, including the species, and that is represented by at least five species in the OMA database. Here, a species tree is shown with colored bars representing individual genes. d , The conserved and lineage-specific gene sets. The conserved repertoire contains all the HOGs defined at the reference ancestral level that cover at least 80% of the species in the clade. These are gene families inferred to be present since the common ancestor. The lineage repertoire is a superset of the conserved repertoire, with the addition of genes that originated later in the lineage and are still present in at least one species in the OMA database. In the repertoires, genes from the different species are grouped into their HOGs. e , OMArk assesses completeness by comparing the conserved ancestral repertoire to the query protein sequences and classifying them as single copy, duplicated or missing. f , OMArk assesses consistency by comparing the query protein sequences to the lineage repertoire and classifying them as taxonomically consistent, inconsistent, unknown or contaminant. 
OMArk also assesses gene model structure by classifying query proteins as partial mapping or fragment. Shapes of species shown in a and b reprinted from Phylopic ( www.phylopic.org ). Silhouettes of Homo sapiens and Canis familiaris dingo by T. M. Keesey (public domain), Pongo abelii by Gareth Monger ( CC-BY 3.0 ), Pan troglodytes by J. Lawley (public domain) and Xenopus laevis by Ian Quigley ( CC-BY 3.0 ). Silhouettes of Saccharomyces cerevisiae by W. Decature (public domain), Laccaria by R. Percudani (public domain), Caenorhabditis elegans by J. Warner (public domain) and Mus musculus by S. Miranda-Rottman ( CC-BY 3.0 ).

Species identification

To infer the species composition of the query proteome, OMArk tracks the protein placement into gene families and their taxa of origin (Fig. 2b ). Ideally, a species’ proteome will have placements only into gene families from its ancestral lineage. For example, human genes will have originated at the common ancestor of primates, mammals and vertebrates, but not rodents. OMArk starts from this assumption and identifies paths in the species tree where placements are overrepresented, and it then selects the most recently emerged clade as the inferred taxon. If multiple paths are overrepresented, OMArk reports the most populated as the main taxon and all others as contaminants.

Ancestral reference lineage identification

Based on the main taxon placement, or a user-specified taxonomic identifier, OMArk selects an ancestral lineage: the most recent taxon that contains the species of interest and at least five species in the OMA database (Fig. 2c ). The selected ancestral lineage is provided in OMArk’s output.
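
The selection rule above can be sketched in a few lines. This is a minimal illustration, not OMArk's implementation: the function name, the input structures and the species counts are hypothetical, and only the five-species threshold comes from the text.

```python
from typing import Dict, List

MIN_SPECIES = 5  # threshold stated in the text


def select_ancestral_lineage(lineage: List[str], species_in_db: Dict[str, int]) -> str:
    """Return the most recent taxon with at least MIN_SPECIES database species.

    `lineage` lists the taxa containing the query species, most recent first.
    `species_in_db` maps each taxon to the number of database species it contains.
    """
    for taxon in lineage:
        if species_in_db.get(taxon, 0) >= MIN_SPECIES:
            return taxon
    raise ValueError("no taxon in the lineage meets the species threshold")


# Hypothetical counts, for illustration only.
lineage = ["Mus", "Murinae", "Myomorpha", "Rodentia", "Mammalia"]
counts = {"Mus": 2, "Murinae": 3, "Myomorpha": 7, "Rodentia": 12, "Mammalia": 120}
selected = select_ancestral_lineage(lineage, counts)
print(selected)
```

With these made-up counts, the two most recent taxa are skipped because they hold fewer than five species, and the first taxon that meets the threshold is returned.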

Completeness assessment

OMArk selects all gene families that were present in the common ancestor of the ancestral lineage and still are present in at least 80% of its extant species (conserved repertoire; Fig. 2d ). The presence of these gene families serves as a proxy for its proteome completeness. OMArk reports the number of selected gene families, their identifiers and the proportion of the conserved gene families that are found in the query proteome as a single copy or duplicated (multiple copies) or are missing (Fig. 2e ). An incomplete proteome would have a high proportion of missing gene families.

Contrary to BUSCO, the conserved genes are not necessarily expected to exist in single copies in extant genomes, although they were likely a single gene in the lineage’s ancestor. Thus, duplicated genes are classified as ‘expected’ if they correspond to a known duplication that occurred after the ancestral lineage’s speciation or ‘unexpected’ otherwise. If the ancestral lineage has a lower ploidy level than the query species due to subsequent whole-genome duplication (WGD; for example, ancestral diploid compared to a tetraploid), then the query proteome will appear as massively duplicated. Users should interpret the results in the context of their query species’ ploidy.
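
As an illustration of the completeness logic described above, the following sketch builds a conserved repertoire with the 80% rule and then tallies single-copy, duplicated and missing families. The data structures and names are assumptions made for this example; the sketch does not distinguish expected from unexpected duplications and is not OMArk's code.

```python
from collections import Counter
from typing import Dict, Iterable, Set

CONSERVATION_THRESHOLD = 0.8  # "at least 80% of its extant species"


def conserved_repertoire(family_presence: Dict[str, Set[str]],
                         clade_species: Set[str]) -> Set[str]:
    """Families found in at least 80% of the clade's extant species."""
    return {
        fam for fam, species in family_presence.items()
        if len(species & clade_species) / len(clade_species) >= CONSERVATION_THRESHOLD
    }


def completeness(conserved: Set[str], placements: Iterable[str]) -> Dict[str, float]:
    """Proportion of conserved families hit once, more than once, or never."""
    hits = Counter(fam for fam in placements if fam in conserved)
    tally: Counter = Counter()
    for fam in conserved:
        n = hits[fam]
        tally["single" if n == 1 else "duplicated" if n > 1 else "missing"] += 1
    return {k: tally[k] / len(conserved) for k in ("single", "duplicated", "missing")}


# Hypothetical toy clade of five species and four gene families.
clade = {"sp1", "sp2", "sp3", "sp4", "sp5"}
presence = {
    "HOG_A": {"sp1", "sp2", "sp3", "sp4", "sp5"},
    "HOG_B": {"sp1", "sp2", "sp3", "sp4"},   # 80%: kept
    "HOG_C": {"sp1", "sp2", "sp3"},          # 60%: not conserved
    "HOG_D": {"sp1", "sp2", "sp3", "sp4", "sp5"},
}
conserved = conserved_repertoire(presence, clade)
# One family label per query protein that was placed.
report = completeness(conserved, ["HOG_A", "HOG_B", "HOG_B", "HOG_C"])
print(sorted(conserved), report)
```

In this toy case one conserved family is single copy, one is duplicated and one is missing, so each category accounts for a third of the conserved repertoire.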

Consistency assessment

The main advantage of OMArk is that it evaluates the consistency of all the genes in the query proteome compared to what is known for its lineage, both taxonomically and structurally.

Taxonomic consistency classifies query proteins based on their taxonomic origin by comparing them to the lineage’s known gene families (lineage repertoire; Fig. 2d ). Proteins fitting this lineage repertoire are classified as consistent, whereas those that fit outside are classified as either inconsistent or contaminant (Fig. 2f ). The contaminant category contains all inconsistent placements that are closer to a contaminant species than to the main species, as determined by the species identification step. Proteins with no gene family assignment are classified as unknown.

Structural consistency classifies query proteins based on sequence feature comparisons with their assigned gene family. Proteins only sharing k -mers with their gene families over part of their sequence are labeled partial mappings, whereas proteins with lengths less than half their gene family’s median length are labeled fragments (Fig. 2f ).
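
The two classifications can be sketched as follows. The fragment rule (shorter than half the family's median length) is taken from the text. The taxonomic labels are simplified: we assume, for illustration only, that contaminant families are already available as a set, whereas OMArk derives them from the species identification step. All names are hypothetical, and the partial-mapping criterion is omitted because no threshold is given here.

```python
from statistics import median
from typing import List, Optional, Set


def taxonomic_label(family: Optional[str],
                    lineage_repertoire: Set[str],
                    contaminant_families: Set[str]) -> str:
    """Label one query protein from the family it was placed in (None if unplaced)."""
    if family is None:
        return "unknown"
    if family in lineage_repertoire:
        return "consistent"
    if family in contaminant_families:  # simplifying assumption, see lead-in
        return "contaminant"
    return "inconsistent"


def is_fragment(protein_length: int, family_lengths: List[int]) -> bool:
    """True if the protein is shorter than half the family's median length."""
    return protein_length < 0.5 * median(family_lengths)


# Hypothetical example values.
lineage_repertoire = {"HOG_A", "HOG_B"}
contaminants = {"HOG_X"}
labels = [taxonomic_label(f, lineage_repertoire, contaminants)
          for f in ("HOG_A", "HOG_X", "HOG_Q", None)]
family_lengths = [380, 400, 410, 420]  # median is 405 residues
print(labels, is_fragment(190, family_lengths), is_fragment(250, family_lengths))
```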

Taxonomic and structural consistency are complementary parts of the consistency assessment performed over the whole proteome and help identify annotation errors, a feature lacking in most quality assessment methods. A proteome with a high proportion of consistent proteins indicates more reliable annotation. Conversely, a high proportion of partial mappings and fragments indicates potential gene model inaccuracies. Inconsistent proteins suggest either gene families not previously identified in the target clade or, if they are primarily partial or fragments, sequences with biased composition. Similarly, unknown proteins may be sequences without close homologs or annotation errors. Thus, not all proteins classified as inconsistent or unknown are necessarily errors, but an unusually high proportion may indicate a systematic error in the annotation.

An example of the OMArk output for the Danio rerio proteome shows it has a high completeness (96.6%) and consistency (96.3%), as expected for a well-curated model species (Fig. 1b ).

Validation on simulated proteomes

To evaluate OMArk’s ability to provide accurate quality assessment, we simulated cases of genome incompleteness, erroneous sequences, gene fragmentation or fusion, and cross-species contamination. We used two datasets of eukaryotic proteomes (Supplementary Table 1 ): a dataset comprising nine model species known for their high quality (model dataset) and a dataset including 16 species representing eukaryotic diversity and absent from the reference OMA database (representative dataset).

Simulated incompleteness

For each proteome in the datasets, we simulated incompleteness by removing varying percentages (10%–90%) of random proteins. OMArk’s results closely approximate the simulated completeness in most cases, although it tends to overestimate it (Fig. 3a and Supplementary Figs. 1 and 2 ). The error margin is lower in the model dataset (+2.3% on average) than in the representative dataset (+9.9% on average). For both datasets, OMArk’s performance is similar to BUSCO’s, but BUSCO overestimates completeness by a slightly smaller margin (+2.1% and +6.1% on average for the model and representative datasets, respectively; Supplementary Figs. 1 and 2 ).
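
The removal step of this simulation is simple to reproduce. The sketch below is one possible way to do it, assuming the proteome is held as a list of protein identifiers; the function name and seed handling are our own choices for the example.

```python
import random
from typing import List


def simulate_incompleteness(proteins: List[str], removed_fraction: float,
                            seed: int = 0) -> List[str]:
    """Return a random subset after removing `removed_fraction` of the proteins."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    n_keep = round(len(proteins) * (1 - removed_fraction))
    return rng.sample(proteins, n_keep)


# Hypothetical proteome of 100 identifiers; remove 30% of them.
proteome = [f"protein_{i}" for i in range(100)]
kept = simulate_incompleteness(proteome, 0.30)
print(len(kept))
```

Running a quality tool on each subset and comparing the reported completeness with `1 - removed_fraction` times the completeness of the source proteome gives the kind of error margin quoted above.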

figure 3

a – d , Three example species of the model dataset (left) and the representative dataset (right) are shown for each simulation. Each simulated error in panels a–d was applied to 10%–90% of the proteome ( x axis). a , Simulated incompleteness. OMArk (top) and BUSCO (bottom) results for the datasets. Colors represent the part of the conserved gene set found in a single copy (green) or duplicated (light green) or are missing (red). The simulated completeness corresponds to the percentage of the genome that has been randomly selected in each simulation. Horizontal black lines show the expected completeness (that is, the measured completeness for the source proteome). b , Erroneous sequence simulation. Colors represent proteins which map to the correct lineage (consistent, blue), to another lineage (Inconsistent, violet) or have no homologs (unknown, black). Hashes indicate structural inconsistency relative to the gene family (either partial mapping (black hashes) or fragmented genes (white hashes)). The appended error ( x axis) corresponds to the quantity of erroneous sequences that was added to the proteome as a percentage of its original protein number. Horizontal red lines indicate the expected number of structural and taxonomically consistent genes, considering the proportion in the source proteome and the known introduced error. c , Fragmented sequence simulation. The x axis corresponds to the percent of the proteome that has been fragmented. The pool of artificially fragmented genes are cut randomly to be between 10% and 90% of the original length of the protein. Horizontal red lines indicate the expected number of nonfragmented taxonomically consistent genes, considering the proportion in the source proteome and the known fragment rates; horizontal pink lines indicate this proportion if half of the fragments are detected. d , Fused sequence simulation. The x axis corresponds to the percent of the proteome that has been fused. 
Pairs of proteins are selected randomly and appended together to simulate fusion. The fused protein gets added to the proteome while the original proteins get removed. Horizontal red lines indicate the expected number of structural and taxonomically consistent genes, considering the proportion in the source proteome that have been fused. Ancestral lineages for the six shown species are Homo sapiens , Hominidae; Drosophila melanogaster , melanogaster subdivision; Arabidopsis thaliana , Brassicaceae; Mytilus coruscus , Lophotrochozoa; Reticulomyxa filosa , SAR (Stramenopiles-Alveolata-Rhizaria) supergroup; and Hibiscus syriacus , Malvaceae.

Both methods overestimate completeness in species with a high number of duplicated genes. This effect is expected, as reporting them as missing requires all copies to be absent. This trend is more pronounced in OMArk, because OMArk does not require conserved genes to be in a single copy in extant species, resulting in a more inclusive set of conserved gene families. Thus, because OMArk reports more duplicates, it overestimates completeness more than BUSCO. This trend is observed in both datasets but especially the representative dataset, as these proteomes have a higher average proportion of duplicated genes (8.4% for the representative dataset versus 2% for the model dataset).

This high level of detected duplication in the representative dataset can be explained by the selected ancestral lineages, which are more distantly related than those selected in the model dataset. Thus, ancestral gene families in the representative dataset may have had more time to undergo duplication. Furthermore, WGD events that occurred after the ancestral lineage can lead to high levels of reported duplication in OMArk and BUSCO. A striking example is the Hibiscus syriacus proteome, where OMArk reports nearly 70% of the genes as duplicates. These results are due to H. syriacus being a tetraploid, having undergone two WGD events after the last Malvaceae common ancestor 14 . Because the Malvaceae clade was selected as the ancestral lineage by OMArk, the higher number of duplicates corresponds to the genes that were retained as two copies or more after the WGD.

Simulated erroneous sequences

We simulated erroneous sequences by adding randomly generated sequences, from 10% to 90% of the proteome, to each proteome in the model and representative datasets. As a result, there was a corresponding increase in the proportion of Unknown proteins, given that these added sequences lacked detectable homologs (Fig. 3b ). In all simulations, OMArk detected the expected proportion of taxonomically and structurally consistent genes, indicating that this category accurately represents the proportion of high-confidence coding sequences. Results were similar whether the sequences were generated from random nucleotides or designed to resemble the target species’ proteins (Supplementary Results: Simulation Results and Supplementary Figs. 3 – 6 ).

Simulated fragmentation

We simulated fragmented proteomes by randomly selecting 10% to 90% of the sequences in each proteome and then randomly removing between 10% and 90% of the length of each selected sequence. OMArk identified an increasing proportion of fragmented, taxonomically consistent proteins, reaching up to half the known number of fragmented sequences. This result is expected, as OMArk only identifies fragments that are less than half the gene family’s median protein length and thus will not detect fragments that are 51% to 90% of the original protein size. Given the modified expected fragmentation detection rate (only half the simulated fragments), there is only a slight underestimation of consistent, nonfragmented proteins: 0.6% for the model dataset and 1.8% for the representative dataset (Fig. 3c and Supplementary Figs. 7 and 8 ). We also detected a slight increase in unknown proteins, possibly because these fragments are too short to be detected as homologs of existing genes.
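
The expected detection rate of one half follows from simple arithmetic, sketched below under two assumptions that are ours rather than stated in the text: the retained length is drawn uniformly between 10% and 90% of the original, and the original protein length is close to its gene family's median length.

```python
def detectable_fraction(min_kept: float = 0.10, max_kept: float = 0.90,
                        cutoff: float = 0.50) -> float:
    """Share of simulated fragments that fall below the detection cutoff.

    Assumes the retained fraction is uniform on [min_kept, max_kept].
    """
    return (cutoff - min_kept) / (max_kept - min_kept)


# (0.50 - 0.10) / (0.90 - 0.10): about half of the simulated fragments
# are short enough to be labeled as fragments.
expected_rate = detectable_fraction()
print(expected_rate)
```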

Simulated fusion

We simulated cases of fused protein-coding genes by merging pairs of randomly selected proteins, ranging from 10% to 90% of the proteome, and added them to the proteomes while removing the original proteins. We expected that OMArk would associate these fused proteins to one of the existing HOGs but as a partial match, as only part of the sequence would be in common with the HOG. However, the increase in partial mappings as the proportion of fused genes rises was less than expected. The proportion of structurally and taxonomically consistent genes was on average 17.6% higher than expected for the model dataset and 13% higher than expected for the representative dataset (Fig. 3d and Supplementary Figs. 9 and 10 ).
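
A minimal sketch of such a fusion simulation is given below. It assumes that the stated percentage refers to the share of original proteins taking part in a fusion, which is our reading of the text; the function name and the identifier format of fused proteins are hypothetical.

```python
import random
from typing import Dict


def simulate_fusion(proteome: Dict[str, str], fused_fraction: float,
                    seed: int = 0) -> Dict[str, str]:
    """Fuse random pairs of proteins and drop the originals."""
    rng = random.Random(seed)
    ids = list(proteome)
    # Number of proteins taking part in a fusion, rounded down to an even count.
    n_fused = int(len(ids) * fused_fraction) // 2 * 2
    chosen = rng.sample(ids, n_fused)
    chosen_set = set(chosen)
    result = {pid: seq for pid, seq in proteome.items() if pid not in chosen_set}
    for a, b in zip(chosen[0::2], chosen[1::2]):
        result[f"{a}+{b}"] = proteome[a] + proteome[b]  # append the two sequences
    return result


# Hypothetical proteome of ten toy sequences; 40% of proteins get fused.
proteome = {f"p{i}": "M" * (i + 1) for i in range(10)}
fused = simulate_fusion(proteome, 0.4)
print(len(fused))
```

Four proteins are merged into two fused entries, so the toy proteome shrinks from ten to eight sequences while the total number of residues is unchanged.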

Simulated contamination

We simulated contamination by introducing sequences from bacteria, fungi, microbial eukaryotes or humans to the model and representative datasets. OMArk accurately identified the taxonomic origin of the contaminant, though its sensitivity varied, especially with a low number of contaminant proteins. For bacterial and fungal sources, contamination became detectable with as few as ten contaminant proteins, corresponding to ~10 kbp contaminant bacterial DNA or ~25 kbp fungal DNA. Contamination was reliably detected at 50 or more contaminant proteins (~50 kbp bacterial DNA, ~125 kbp fungal DNA). However, for other eukaryotic species, precise contamination detection required at least 100 to 200 contaminant proteins (~200–700 kbp free-living unicellular eukaryote DNA). OMArk missed contamination when the contaminant had no close relative in OMA or was too closely related to the contaminated species (Supplementary Table 2 ; Supplementary Results: Contamination simulation ). Specifically, OMArk only detected human sequence contamination in vertebrates at high levels (1,000 proteins; ~150 Mbp human DNA) and not at all in mammals.
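
The protein counts and DNA amounts quoted above imply an approximate amount of contaminant DNA per protein for each source. The short calculation below only rearranges the figures given in the text; the variable names are ours, and the resulting densities are rough conversions rather than measured values.

```python
# (contaminant proteins, approximate contaminant DNA in kbp), as quoted in the text
reported = {
    "bacteria": [(10, 10), (50, 50)],
    "fungi": [(10, 25), (50, 125)],
    "human": [(1000, 150_000)],  # ~150 Mbp expressed in kbp
}


def kbp_per_protein(pairs):
    """DNA amount divided by protein count for each quoted pair."""
    return [dna_kbp / n_proteins for n_proteins, dna_kbp in pairs]


density = {source: kbp_per_protein(pairs) for source, pairs in reported.items()}
for source, values in density.items():
    print(source, values)
```

The two bacterial figures and the two fungal figures each give the same density, which shows that the quoted DNA amounts scale linearly with the number of contaminant proteins.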

OMArk results for 1,805 eukaryotic reference proteomes

Comparing protein-coding gene annotations between closely related species, including one ‘gold standard,’ is essential to assess annotation quality 2 . Thus, we ran OMArk on a set of 1,805 Eukaryotic UniProt proteomes to serve as a reference dataset (Fig. 4 and Supplementary Table 3 ). We provide quality assessments for major clades and detailed analyses of specific proteomes with low-quality results in Supplementary Results: Results on UniProt Reference Proteomes . All results can be visualized on the OMArk web server ( https://omark.omabrowser.org ) and compared to those of closely related species.

figure 4

Bar graphs of the number of canonical proteins in each proteome (top). Completeness assessment showing the proportion of conserved genes (green) in the proteome, with a breakdown among single-copy (light green), duplicated (dark green) and missing (red) genes in all proteomes (middle). Consistency assessment showing the proportion of accurately mapped proteins (consistent; blue), incorrectly placed proteins (Inconsistent; purple), contaminant proteins (orange) and proteins with no mapping (unknown; black) (bottom). Genomes are ranked by taxonomy, with major eukaryotic taxa shown on a taxonomic tree at the bottom.

OMArk and BUSCO comparison

We compared OMArk and BUSCO for assessing completeness for the 1,805 Eukaryotic UniProt Reference Proteomes. We define completeness as the total percentage of conserved genes from either BUSCO or OMArk that are classified as single copy, duplicated copies or fragments in the query proteome (that is, not missing). Note that this differs from BUSCO’s definition of completeness, which does not include fragments. OMArk and BUSCO yield similar results overall, with a Pearson correlation of 0.86 for completeness across the 1,805 proteomes (Fig. 5 ). Disparities are expected, as OMArk considers both single-copy and multicopy genes, whereas BUSCO is restricted to single-copy genes. For 57% of the proteomes, BUSCO versus OMArk completeness differed by 5% or less. Where the difference was larger, proteomes considered more complete by OMArk typically exhibited more fragments, indicating OMArk’s ability to identify fragmented proteins without categorizing them as missing.
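
The completeness definition used for this comparison (everything that is not missing) and the correlation between the two tools can be computed as sketched below. The helper names and the example scores are hypothetical; the sketch only illustrates the calculation, not the values reported here.

```python
from math import sqrt
from typing import Sequence


def completeness_pct(n_conserved: int, n_missing: int) -> float:
    """Percentage of conserved genes found as single copy, duplicated or fragment."""
    return 100 * (n_conserved - n_missing) / n_conserved


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical per-proteome completeness scores from the two tools.
omark_scores = [completeness_pct(1000, m) for m in (20, 50, 150, 400)]
busco_scores = [97.5, 96.0, 88.0, 55.0]
r = pearson(omark_scores, busco_scores)
print(omark_scores, round(r, 3))
```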

figure 5

Each point on the scatterplot is one of the 1,805 Eukaryotic UniProt Reference Proteomes assessed by both methods. The x axis is the percentage of the conserved set of ancestral genes found in the query proteome by OMArk. The y axis is the percentage of BUSCO genes found in the query proteome by BUSCO. Both completeness scores include duplicated and fragmented proteins. Proteomes are colored by the percentage of fragments found in the proteome, as determined by OMArk.

The proteome’s lineage also influenced the disparity in completeness scores between BUSCO and OMArk. Certain BUSCO lineages, such as Liliopsida and Stramenopiles, were often deemed as more complete by BUSCO, whereas lineages such as Aves and Nematoda tended to be deemed as more complete by OMArk (Supplementary Fig. 14 ). This bias may stem from the number of ancestral genes assessed, as fewer BUSCO genes or conserved HOGs generally resulted in higher BUSCO completeness. Conversely, a higher number of BUSCO genes or conserved HOGs resulted in higher OMArk completeness. Additionally, when OMArk deemed a proteome as more complete, the OMA database typically had fewer species in the relevant clade than for proteomes where BUSCO estimated a higher completeness (Supplementary Table 7 ). Thus, the lineage and consequently the number of conserved genes used for assessment affects completeness in both BUSCO and OMArk. A larger set of conserved genes and more species in the lineage of interest likely lead to more accurate completeness assessments.

Runtime comparison over the same set of proteomes showed OMArk is generally faster in terms of total CPU time, with an average of 9.2 min per proteome for OMArk versus 25.2 min per proteome for BUSCO for all 1,805 proteomes. BUSCO’s runtime largely depends on the number of BUSCO genes used in the assessment, whereas OMArk’s runtime depends mainly on the number of proteins in the query proteome.

These results highlight the biases inherent in each tool. Ultimately, we advise using both software packages to obtain the most informative gene repertoire quality assessment. More comparisons are detailed in the Supplementary Results: Comparison with BUSCO on UniProt Reference Proteomes .

Contamination in public databases

OMArk detected 124 contamination events across 79 of 1,805 proteomes, some with multiple contaminating species (list in Supplementary Table 4 ). Two of them, Ricinus communis and Lupinus albus , were found to be contaminated by ten and seven species, respectively (mostly bacteria and one fungus), indicating that extreme cases of contamination persist in public databases. We independently verified each contamination case using BLAST and BlobToolKit Viewer (Supplementary Table 4 ) and confirmed 117 (93.6%) of the contamination events in 73 species.

Error propagation in some avian proteomes

We detected widespread presence of fragmented genes in the 234 avian species from the UniProt Reference Proteomes (median proportion of taxonomically consistent fragments: 18.3%, standard deviation: 4.8%). However, this was not observed in well-studied birds such as chicken ( Gallus gallus ; proportion of taxonomically consistent fragments: 2.4%; Supplementary Fig. 18 ). The proportions of fragments depended mainly on the source of the proteome. Most of the highly fragmented proteomes originated from the same source, the Bird 10 K consortium annotation pipeline 15 , and tended to have fragments in the same gene families, suggesting systematic bias (Supplementary Figs. 19 and 20 ; Supplementary Results: Analysis of avian proteomes ). Annotations for these genomes were performed using, among other sources of evidence, homology from the Ensembl 85 (ref. 16 ) annotation of zebra finch 15 ( Taeniopygia guttata ; taeGut3.2.4 assembly). OMArk also detected a high proportion of fragments in this older version of the zebra finch proteome (proportion of taxonomically consistent fragments: 20.3%), but not in the latest version (0.5% of taxonomically consistent fragments; Ensembl 99 + ; bTaeGut1_v1.p assembly). Furthermore, a high proportion of genes fragmented in the Bird 10 K proteomes were also fragments in the older zebra finch proteome (Supplementary Fig. 21 ). These results suggest fragments in these bird proteomes likely result from propagation from the fragmented taeGut3.2.4 proteome.

Selection of high-quality proteomes among close species

OMArk’s quality assessment depends on the selected ancestral lineage. Thus, a best practice is to compare the results to species sharing the same ancestral lineage. We illustrate this by comparing the OMArk results of a model species, Mus musculus , with its close relatives within the Myomorpha clade, a group of mouse-like rodents (Fig. 6a ). As expected, the well-curated species Mus musculus and Rattus norvegicus scored best, both in completeness and consistency. Several other species in the clade exhibited noticeable quality issues, despite being in the OMA database and contributing to the ancestral reference HOGs (for example, Cricetulus griseus ). We observed similar patterns for other model organisms consistently ranking as the best proteomes in their clade (detailed in Supplementary Results: Comparison of proteomes from closely related species ; Supplementary Figs. 22 – 30 ).

figure 6

a, OMArk results for different Myomorpha proteomes. The species currently in OMA which contributed to the ancestral reference lineage gene families are shown with an asterisk. Mouse (Mus musculus) and the Norwegian rat (Rattus norvegicus) stand out as high-quality annotations. b, OMArk results for Bombus impatiens, comparing two different versions of the assembly. The newest assembly (version 53) shows an improvement in gene set consistency, with a slight decrease in completeness, although the number of proteins substantially decreased.

These results demonstrate OMArk’s ability to identify the best-quality proteome in any clade of interest, which is useful for selecting representative genomes and for improving annotation of nonmodel species.

Assembly and annotation comparisons

OMArk can be used to compare gene repertoires from different assemblies or annotations of the same species, aiding in benchmarking annotation methods or gauging improvement in gene repertoire completeness and consistency over time. To illustrate, we ran OMArk on newer versus older assemblies or annotations for species with documented changes between the Ensembl Metazoa releases 53 and 54 17 . This corresponds to 11 protostome species with annotations on different assembly versions and seven nematode species with different annotations on the same assembly (Supplementary Table 5 ).

When comparing OMArk results across different annotation versions of the same assembly, we observed minor changes (less than 1% for most metrics), likely due to incremental annotation updates affecting few genes. Nevertheless, we still detected a trend toward fewer duplicated genes and more consistent genes ( Supplementary Results: Assembly and annotation comparisons ).

Comparing annotations on different assemblies, OMArk detects noticeable improvement in completeness and/or in structurally and taxonomically consistent genes for all but one species, but not always in both (Supplementary Results: Assembly and annotation comparisons; Supplementary Table 9). For instance, B. impatiens and Acyrthosiphon pisum showed a slight decline in completeness (−1.21% and −0.73%, respectively) but a large rise in taxonomically and structurally consistent genes (+17.34% and +21.06%, respectively; Fig. 6b). In contrast, Crassostrea gigas exhibited an increase in completeness (+4.38%) and a decrease in consistency (−9.16%).

OMArk also detected the removal or decrease in contamination for three species (Schistosoma mansoni, A. pisum and Glossina fuscipes), as well as new contamination introduced in Teleopsis dalmanni’s latest assembly. Most of the observed changes had no clear correlation with improvement in assembly quality metrics, except the proportion of fragmented genes decreasing with a higher N50 (Pearson correlation: 0.85, P value: 0.002). Our results indicate that new assemblies generally improved gene set quality, changed contamination status and reduced fragmented gene models due to higher assembly contiguity. However, these new assemblies were not necessarily annotated in the same way, making it difficult to discern whether observed changes are due to improved assemblies or to improvements in annotation procedures.

Finally, we compared 1,200 pairs of protein-coding gene annotations, each pair including one annotation from Ensembl and the other from the NCBI (GenBank and RefSeq), both derived from the same assembly. We analyzed the differences in OMArk and BUSCO results for all these pairs of annotations (Supplementary Tables 6 and 8 and Supplementary Fig. 32 ). NCBI proteomes generally exhibited higher completeness (+1.39%), fewer proteins with no known homologs (−0.64% unknown) and fewer structurally inconsistent proteins (−0.18% partial mapping and −0.64% fragments). Conversely, Ensembl proteomes exhibited a slightly lower taxonomic inconsistency (−0.09%).

Because OMArk’s underlying OMA database predominantly sources its proteomes from Ensembl (74% of eukaryotic proteomes, Supplementary Fig. 31), we hypothesized this might introduce a bias. We tested this by comparing results on a subset of annotation pairs from species in OMA sourced from Ensembl to the rest of the dataset. In this subset, proteomes from Ensembl had fewer detected fragments (−0.27%), fewer partial mapping proteins (−0.28%) and fewer taxonomically inconsistent proteins (−0.28%) than NCBI proteomes. These differences confirm that OMArk is slightly biased due to the reference proteomes’ origin. Thus, NCBI proteomes may appear slightly worse than they actually are, not necessarily due to quality issues but due to discrepancies in gene model predictions compared to Ensembl. However, the quantitative impact of such bias is minimal and unlikely to obscure any major annotation quality issues.

Overall, our findings highlight OMArk as a valuable tool for tracking improvements in genome assembly and annotation. By analyzing other metrics beyond completeness, OMArk can detect changes toward overall better gene sets, even when the completeness decreases. Furthermore, OMArk is effective for comparing different methods or sources of annotation, although users should note that minor differences between proteomes could be attributed to a bias induced by OMArk’s reference proteomes.

OMArk, leveraging the OMA database and k-mer-based fast gene family placement, evaluates the quality of protein-coding gene annotations. Our results on simulated incomplete genomes and on real proteomes demonstrate that OMArk’s completeness measure is comparable to BUSCO. This finding is not surprising, as both methods assess the presence or absence of near universally conserved genes in a lineage. However, there are several key differences. First, OMArk not only focuses on single-copy conserved genes but also includes gene families that underwent duplication. Second, BUSCO uses hidden Markov model profiles to map query genes to their conserved gene families, a method more accurate but slower than the k-mer mapping exploited by OMArk. Finally, OMArk does not rely on a prespecified dataset of conserved genes but automatically chooses them depending on the query species’ taxonomic lineage.

OMArk assesses proteome consistency using a broader selection of orthologous groups than conserved ones, and proteins placed into gene families that are taxonomically consistent with the species of interest can be more confidently considered as true coding genes. Moreover, we can assess the quality of their gene structures by comparing to known sequences in the same family. However, there are a few caveats when interpreting OMArk results: gene consistency with the same lineage is expected only in species with predominantly vertical gene family inheritance and if the chosen family is well sampled and of good quality in our reference database.

Like most orthology databases, OMA, OMArk’s reference database, has uneven taxonomic sampling. For instance, mammals are overrepresented relative to total biodiversity, whereas free-living unicellular eukaryotes are underrepresented. OMA is actively maintained and has a release cycle of under a year, focusing on improving coverage for underrepresented species while including only high-quality data. Consequently, OMArk’s resolution is expected to improve as more diverse genomes are included. When choosing a reference lineage, OMArk selects the most specific clade with a sufficient number of species. However, an excessively broad clade may lack accuracy (most genes being consistent and few genes needed for completeness), whereas an excessively narrow clade may not be generalizable. OMArk issues warnings for ancestral lineages at the genus level or below and the phylum level and above and allows users to select the taxonomic rank for an ancestral lineage. We recommend that users be mindful of OMA’s species coverage for the ancestral lineage and interpret the results critically in this context.

OMArk’s completeness and consistency metrics assume that proteomes in OMA accurately reflect the ‘real’ gene content of the species in the clade, which may not always hold true beyond a few highly curated species. Any proteome will likely carry bias from its annotation method. Such a bias impacts OMArk because many eukaryotic proteomes in OMA were downloaded from Ensembl. Although these are high-quality annotations, OMArk consequently is slightly biased towards Ensembl’s and similar gene prediction pipelines. Comparisons with NCBI show that some proteins not predicted by these pipelines will appear as inconsistent, and other valid gene structure predictions may be classified as partially mapping or fragmented. Although this effect is minor, users should be careful of such bias when comparing annotations. Finally, OMArk reports possible contamination, which should help genome annotators to flag contamination cases and reassess their genomic data. However, users should be aware of a few caveats. OMArk has a low sensitivity to contamination from human sequences or from eukaryotes from lowly sampled clades, and it is limited to coding regions. Furthermore, OMArk cannot discriminate between contamination and recent horizontal gene transfer. Using the list of potential contaminants, annotators can identify the corresponding contigs in the genome assembly for validation. Nevertheless, we recommend using assembly-level dedicated methods 18 such as BlobToolKit 19 to perform in-depth analysis and correction of genome assemblies.

OMArk provides a comprehensive proteome quality assessment, aiding annotators in improving gene annotation and enabling users to select high-quality proteomes for their investigations. We hope OMArk will help improve the quality of existing and newly produced gene sets, advancing the field of genomic research.

OMAmer placement

The OMAmer database used in this study was generated from the November 2022 release of the OMA database 10. Placements were made with OMAmer version 2.0.0, using default parameters. Root HOGs (that is, gene families) with five or fewer proteins and a species coverage (the proportion of species in the clade with a gene in the HOG) lower than 0.5 were excluded, as they are most likely spurious.
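The exclusion rule for root HOGs can be illustrated with a minimal sketch. It assumes that both conditions must hold for a family to be dropped (the text joins them with "and"); the `RootHOG` record and the function names are hypothetical and are not part of OMAmer or OMArk.

```python
from dataclasses import dataclass


@dataclass
class RootHOG:
    """Hypothetical record describing one root HOG (gene family)."""
    hog_id: str
    n_proteins: int
    species_coverage: float  # fraction of clade species with a gene in the HOG


def is_likely_spurious(hog, max_proteins=5, min_coverage=0.5):
    # Assumption: a family is excluded only when it is BOTH small and poorly covered.
    return hog.n_proteins <= max_proteins and hog.species_coverage < min_coverage


def filter_root_hogs(hogs):
    """Keep the root HOGs that are not flagged as likely spurious."""
    return [h for h in hogs if not is_likely_spurious(h)]


example = [
    RootHOG("HOG:A", n_proteins=3, species_coverage=0.2),   # small and sparse: dropped
    RootHOG("HOG:B", n_proteins=3, species_coverage=0.9),   # small but well covered: kept
    RootHOG("HOG:C", n_proteins=40, species_coverage=0.3),  # sparse but large: kept
]
kept = [h.hog_id for h in filter_root_hogs(example)]
```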

Overview of the OMArk algorithm

All analyses shown here were performed with OMArk version 0.3. The OMArk software takes the following as minimum input: 1) the output of the OMAmer placement for a whole proteome, whereby proteins of these proteomes are placed in HOGs, and 2) the path to the corresponding OMAmer database.

Optionally, OMArk can take the NCBI taxonomy ID of the proteome’s species which will be used to select its ancestral lineage; otherwise, its taxa will be inferred automatically (see 'Automatic species identification and contamination assessment' below). The FASTA file of the query proteome is also an optional input, which may be used to generate output FASTA files for inconsistent, contaminant and unknown proteins. Finally, if the proteome contains multiple isoforms per gene, an additional option (-i) allows the user to provide a comma-separated file where all protein identifiers corresponding to a single gene are written on the same line. Only one isoform will be selected for completeness and consistency assessment, based on the OMAmer placement score as detailed in the section ‘Isoform selection’ below.

Isoform selection

If the target proteome contains more than one protein per gene, and an isoform file was provided by the user, OMArk will automatically select the sequence with the best match in the OMAmer database. This selection is based on the hit’s ‘family P’ (from OMAmer), which represents the negative natural logarithm of the P value of having as many or more k-mers in common under a binomial distribution. This helps ensure that gene model comparison will happen between similar isoforms. OMArk selects the isoform with the lowest P value (that is, the highest ‘family P’) as the isoform of reference. The list of selected isoforms is then provided in OMArk’s output in a file with the suffix _selected_isoform.txt.
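As an illustration of this selection rule, the sketch below picks one isoform per gene from a mapping of protein identifiers to ‘family P’ scores. It is not OMArk's code: the function name, the in-memory inputs and the handling of unplaced isoforms (treated as the worst possible score) are assumptions made for the example.

```python
import math


def select_isoforms(isoform_lines, family_p):
    """Return one protein identifier per gene.

    isoform_lines: iterable of strings, one gene per line, with the
        identifiers of its isoforms separated by commas.
    family_p: dict mapping protein identifier -> 'family P', the negative
        natural log of the placement P value (higher means lower P value).
    """
    selected = []
    for line in isoform_lines:
        ids = [pid.strip() for pid in line.split(",") if pid.strip()]
        if not ids:
            continue
        # Assumption: isoforms without a placement score rank last.
        best = max(ids, key=lambda pid: family_p.get(pid, -math.inf))
        selected.append(best)
    return selected


lines = ["g1.t1,g1.t2", "g2.t1"]
scores = {"g1.t1": 12.5, "g1.t2": 80.1}
chosen = select_isoforms(lines, scores)
```

Because ‘family P’ is a negative logarithm, taking the maximum score is equivalent to taking the minimum P value.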

Automatic species identification and contamination assessment

The taxonomic distribution of HOGs in the query proteome can be used to automatically detect the species from which they come. OMArk does this by using the nonredundant list of HOGs in which proteins of the query proteome were placed and extracting the taxonomic level where each HOG is defined (that is, the taxonomic node after emergence or duplication of the gene family). This step is used to obtain the number of mappings to each clade in the tree of life, which we call clade occurrence \(N\).

To reduce noise due to incorrect mapping, which is more common in broad clades with a large number of HOGs, we divide the clade occurrence by the number of total HOGs defined at each level to obtain a normalized clade occurrence \(N^{\prime}\). In the presence of only one species and no noise, we assume the most likely placement would be the clade with the highest normalized clade occurrence, with all its parent clades having an equal or lower normalized occurrence count.
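The two quantities can be computed in a few lines. The sketch below is illustrative only: the input dictionaries (`placed_hog_levels`, `hogs_per_level`) and the toy clade names and counts are assumptions, not OMArk data structures.

```python
from collections import Counter


def clade_occurrences(placed_hog_levels, hogs_per_level):
    """Compute clade occurrence N and normalized clade occurrence N'.

    placed_hog_levels: dict mapping each HOG hit by the query proteome to the
        taxonomic level at which that HOG is defined (keys are unique, so the
        HOG list is nonredundant by construction).
    hogs_per_level: dict mapping a taxonomic level to the total number of
        HOGs defined at that level in the reference database.
    """
    occurrence = Counter(placed_hog_levels.values())
    normalized = {clade: n / hogs_per_level[clade] for clade, n in occurrence.items()}
    return dict(occurrence), normalized


placed = {"HOG:1": "Mammalia", "HOG:2": "Mammalia", "HOG:3": "Eukaryota"}
totals = {"Mammalia": 4, "Eukaryota": 10}
n, n_norm = clade_occurrences(placed, totals)
```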

The OMArk algorithm uses this assumption and implements a few corrections to account for noise in HOG placement and allow for more than one species in the proteome, in case of contamination. First, all clades with an occurrence of more than two are used to construct a simplified taxonomic tree containing only branches leading to these clades. The tree structure itself is derived from the OMA underlying taxonomy, which used until now the NCBI taxonomy 20 .

The OMArk algorithm for species identification is a recursive postorder traversing function. At each leaf, it returns the leaf clade as likely placement, with occurrence \(N_{\mathrm{leaf}}\) and normalized occurrence \(N^{\prime}_{\mathrm{leaf}}\) of the taxonomic level. At each node, it compares the occurrence scores of the current node to the most likely placements of its children. To be considered relevant, a child’s placement has to satisfy:

\(N_{\mathrm{child}} > N_{\mathrm{node}} \times D_{\mathrm{node},\mathrm{child}}\), where \(D_{x,y}\) represents the proportion of HOGs defined in clade \(x\) that have a child defined in clade \(y\). A high value represents a high duplication rate in the branch leading from \(x\) to \(y\). This condition controls for high duplication numbers in the branch leading to some lineages (for example, ancestral WGD), which favor overspecific placement into those clades.

\(N^{\prime}_{\mathrm{child}} > N^{\prime}_{\mathrm{node}} \times \vert S_{\mathrm{child}} \vert / \vert S_{\mathrm{node}} \vert\), where \(\vert S_{x} \vert\) is the number of species in clade \(x\). This condition controls for sampling imbalance in lineages that favor overspecific placement in larger clades.

If only one child is considered relevant, it is returned as the most likely taxon. If more than one is considered relevant, all are returned as likely taxa. If no child is considered relevant, only the current node is returned as likely taxon. After traversal, this module outputs a list of independent clades which have more hits than expected by chance.
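The traversal described above can be sketched as a short recursive function. This is a reading of the text rather than OMArk's implementation: the `Clade` class, the `dup_rate` lookup (standing in for \(D_{x,y}\), defaulting to 0 when a pair is absent) and the choice to evaluate both conditions against the placement returned by each child subtree are assumptions, and all numbers are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Clade:
    name: str
    n: int            # clade occurrence N
    n_norm: float     # normalized clade occurrence N'
    n_species: int    # |S|, number of species in the clade
    children: list = field(default_factory=list)


def likely_taxa(node, dup_rate):
    """Postorder traversal returning the most likely placement(s) under node."""
    if not node.children:
        return [node]  # a leaf is its own likely placement
    relevant = []
    for child in node.children:
        for cand in likely_taxa(child, dup_rate):
            d = dup_rate.get((node.name, cand.name), 0.0)
            passes_duplication = cand.n > node.n * d
            passes_sampling = cand.n_norm > node.n_norm * cand.n_species / node.n_species
            if passes_duplication and passes_sampling:
                relevant.append(cand)
    # No relevant child placement: the current node is the likely taxon.
    return relevant if relevant else [node]


mammals = Clade("Mammalia", n=40, n_norm=0.4, n_species=30)
fungi = Clade("Fungi", n=3, n_norm=0.001, n_species=40)
root = Clade("Eukaryota", n=50, n_norm=0.05, n_species=100, children=[mammals, fungi])
dup_rate = {("Eukaryota", "Mammalia"): 0.1, ("Eukaryota", "Fungi"): 0.1}
result = [c.name for c in likely_taxa(root, dup_rate)]
```

In this toy tree only the mammalian clade passes both conditions; the fungal clade has too few hits relative to the root (3 is not greater than 50 × 0.1) and is treated as noise.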

For each clade with more placements than expected, we select all proteins that can be unambiguously attributed to these clades (that is, all proteins that map to a HOG defined at a node in the subtree leading to its first common ancestor with any other clade in the list). The clade with the most proteins is considered as the most likely main taxon and the others as contaminants. When OMArk detects multiple possible contaminant species for a protein based on its placement, it reports the protein as an ‘ambiguous contaminant’ sequence. This feature possibly overestimates the proportion of contaminant sequences, especially in the presence of spurious hits (that would otherwise be in the inconsistent category), but ensures most of the contaminants are included in the category.

The completeness assessment measures the proportion of HOGs that are expected to be conserved in the species’ lineage. This assessment is done by first selecting the ancestral lineage of the species, defined as the most recent taxonomic level including the species and represented by at least five species in the OMA database. Then, OMArk defines the ancestral ‘conserved repertoire’ of the query species: all the HOGs defined at this ancestral level that cover more than 80% of the species in the clade.

Because a HOG at the selected taxonomic level represents a single ancestral gene, conserved HOGs are classified as one of the following:

Single copy if one protein in the query proteome maps to it. To be robust to minor errors in phylogenetic placement, a single underspecific hit (placement in a parent HOG; Fig. 2a ) or a single overspecific hit (placement in a child HOG) is sufficient to consider a conserved HOG as single copy.

Duplicated if more than one query protein maps to it. A duplicated, conserved HOG is further classified as unexpected if multiple proteins are all placed into the ancestral HOG itself (that is, no evidence of such duplication exists in the OMA database) or expected if the proteins were placed into subfamilies of the HOG (that is, the duplication event is documented in the database).

Missing if no proteins in the query genome are placed into it.
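A compact sketch of this classification follows. It assumes each query protein mapped to a conserved HOG is labeled by where it was placed ("ancestral" for the HOG itself, "parent" or "child" otherwise), and that any duplicated HOG whose hits are not all in the ancestral HOG counts as an expected duplication; mixed cases are not specified in the text, so that rule is an assumption. Function names and category strings are illustrative.

```python
from collections import Counter


def conserved_repertoire(hog_species_coverage, min_coverage=0.8):
    """HOGs at the ancestral level covering more than 80% of the clade's species."""
    return {hog for hog, cov in hog_species_coverage.items() if cov > min_coverage}


def classify_conserved_hog(placements):
    """placements: list of 'ancestral', 'parent' or 'child', one per mapped protein."""
    if not placements:
        return "Missing"
    if len(placements) == 1:
        return "Single"  # a single under- or overspecific hit still counts
    if all(p == "ancestral" for p in placements):
        return "Duplicated, Unexpected"
    return "Duplicated, Expected"  # assumption: any subfamily evidence suffices


def completeness_summary(conserved, hits_by_hog):
    """Proportion of conserved HOGs falling in each category."""
    counts = Counter(classify_conserved_hog(hits_by_hog.get(h, [])) for h in conserved)
    return {category: n / len(conserved) for category, n in counts.items()}


coverage = {"H1": 0.95, "H2": 0.90, "H3": 0.85, "H4": 0.50}
hits = {"H1": ["ancestral"], "H2": ["child", "child"], "H4": ["ancestral"]}
conserved = conserved_repertoire(coverage)
summary = completeness_summary(conserved, hits)
```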

The consistency assessment evaluates the query proteome quality, again depending on the placement of its proteins into HOGs and the taxonomic level at which these HOGs are defined. Here, OMArk uses a ‘lineage repertoire’ of the query species: all the HOGs from the conserved ancestral repertoire plus those that originated later on and are still present in at least one species of the lineage. It uses this lineage repertoire to classify proteins as:

Unknown proteins are those that were not placed into existing HOGs. They correspond to either errors in the annotation or to gene families with no detectable counterpart in OMA (due to falling in sparsely sampled clades or being a novel protein).

Consistent proteins are those that were placed into a HOG consistent with the reference lineage: the HOG has a representative of at least one species from the lineage, whether it was present in the common ancestor of the reference lineage or emerged in its descendants.

Contaminant proteins are those that map to a lineage of another species which has been detected as a likely contaminant by the contaminant detection module of OMArk (see ‘Automatic species identification and contamination assessment’ in Methods ).

Inconsistent proteins are those that were placed into HOGs from other parts of the tree of life and for which there is no evidence the gene families existed in the selected lineage or in any contaminant lineage. They are likely to correspond to gene families that were not observed in those species before or to be incorrect protein sequences.

For the proteins that map to existing HOGs, an additional characterization is provided:

Partial mappings are proteins for which less than 80% of the sequence overlaps with their target root HOG, that is, at least 20% of the sequence at the extremity of the protein has no k-mer in common with the root HOG.

Fragments are sequences that are not partial hits but whose length is <50% of the median length of sequences in the HOG in which they were placed.
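The two thresholds can be expressed as a small decision function. This is a sketch under the assumption that the overlap fraction and the median HOG sequence length have already been computed; the function name and category labels are illustrative.

```python
def structure_category(overlap_fraction, length, hog_median_length):
    """Classify a placed protein by structural consistency.

    overlap_fraction: fraction of the query sequence overlapping the root HOG.
    length: query protein length in amino acids.
    hog_median_length: median length of sequences in the HOG of placement.
    """
    if overlap_fraction < 0.8:
        return "partial_mapping"
    # Only proteins that are not partial hits can be called fragments.
    if length < 0.5 * hog_median_length:
        return "fragment"
    return "complete"
```

For example, a 120-residue protein that fully overlaps a family whose median length is 400 residues would be reported as a fragment.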

Acquisition of proteome data

Reference proteomes were downloaded from UniProtKB 8 on 10 February 2022 (version April 2021). Assembly data for this dataset were kindly provided by the UniProt Reference Proteome team and are available in Supplementary Table 3. Ensembl Metazoa proteomes were downloaded from their ftp website from version 52, 53 or 54 of the database (version number is reported in Supplementary Table 5). Data for the comparison of Ensembl and NCBI annotations were downloaded separately for each database. Ensembl proteomes were downloaded from Ensembl FTP for version 110 of the Main Ensembl website and version 57 of Ensembl Plants, Ensembl Metazoa, Ensembl Fungi and Ensembl Protists. NCBI proteomes were downloaded via the NCBI Datasets python API in August 2023, requesting genomes with annotation and downloading GFF and protein files. Isoform files were generated for the Ensembl proteomes using the gene information in the FASTA header; NCBI isoform files were created using the gene and protein information in the corresponding GFF files.

Generation of simulated proteomes

Two datasets were used to assess the effect of introducing errors into proteomes on OMArk quality scores. These two datasets of real proteomes were used as the basis of the simulation: model proteomes and taxonomically representative proteomes. Model proteomes correspond to model eukaryotic species whose proteomes are assumed to be of high quality. Representative proteomes were selected under several criteria: they represent the major eukaryotic taxonomic divisions (two of each when possible), they must not be present nor have species of the genus represented in the OMA database (to avoid circularity) and they must have the best score possible for the aspect of OMArk quality measures (mainly, few missing genes and a higher proportion of consistency from other species of their division). Both lists of species are available in Supplementary Table 1 .

These source proteomes were manipulated in six ways, each simulating a case of spurious annotation:

Missing genes. For each proteome, only a fraction of the proteins were kept at random. This was repeated independently ten times with different proportions of the proteome kept, from 10% to 90% by increments of 10%.

Erroneous sequences. Errors in gene annotation were simulated from randomly generated nucleic sequences, from an equiprobable distribution of each base (a 25% chance of drawing each of A, T, G and C). The sequences were generated by increments of three, representing codons in the open reading frame, until a stop codon appeared. The resulting sequences were then translated into proteins and kept if their length was more than 20 amino acids. These sequences, independently generated in each simulation, were then added to each proteome proportionally to the original number of proteins in it, from 10% to 90% by increments of 10%.
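The generation of one random erroneous protein can be sketched as follows. This is an illustration, not the simulation code used in the study: it assumes the standard genetic code and a seeded `random.Random` generator, and the function name is hypothetical.

```python
import itertools
import random

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def random_protein(rng, min_length=20):
    """Draw random codons until a stop codon; retry until longer than min_length."""
    while True:
        residues = []
        while True:
            codon = "".join(rng.choice(BASES) for _ in range(3))
            aa = CODON_TABLE[codon]
            if aa == "*":  # stop codon ends the open reading frame
                break
            residues.append(aa)
        if len(residues) > min_length:
            return "".join(residues)


rng = random.Random(0)
spurious = [random_protein(rng) for _ in range(5)]
```

With equiprobable bases, 3 of the 64 codons are stops, so most draws are short and are discarded by the length filter.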

Amino acid distribution-aware erroneous sequences. More realistic erroneous sequences were generated by computing the empirical distribution of amino acids in all proteins of the source proteome (treating stop codons as an additional amino acid), then sampling random characters from this distribution. This generated proteins with similar amino acid distribution and average length as the proteins in the target proteomes. These sequences, independently generated in each simulation, were then added to each proteome proportionally to the original number of proteins in it, from 10% to 90% by increments of 10%.

Fragmented sequences. We simulated fragments in the proteome by selecting random proteins in it and removing a random part of each selected sequence, between 10% and 90% of its length. The part was removed from either the C-terminal or N-terminal end, randomly with equal probability of each. This was repeated with different sequences until a target proportion of the proteome size was reached. This process was done independently ten times for each proteome, for 10% to 90% of the proteome, by increments of 10%.

Fused sequences. We simulated fused protein sequences by randomly selecting pairs of proteins in the proteome and appending them to one another. Before the simulated fusion, we removed between 0% and 20% of one protein at the C-terminal end and between 0% and 20% of the other at the N-terminal end. The merged protein was added to the proteome while the two original proteins were removed. This process was repeated until a target proportion of the proteome size was reached. This step was done independently ten times for each proteome, for 10% to 90% of the proteome, by increments of 10%.

Contamination. A list of eukaryotic and bacterial proteomes, either from common contaminants in genomic data or from microscopic species from a variety of clades, was selected as contaminant proteomes. Then, a fixed number of proteins (10, 20, 50, 100, 200, 500 and 1,000) were drawn randomly without replacement independently from the contaminant proteomes and added to the complete source proteomes.
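A minimal sketch of the fragmentation step is given below. It assumes that the removed fraction is drawn uniformly between 10% and 90% and that sequences have at least two residues; the function names, the rounding and the dictionary-based proteome are illustrative choices, not details from the study.

```python
import random


def fragment_protein(seq, rng):
    """Remove 10-90% of the sequence from one randomly chosen terminus."""
    fraction = rng.uniform(0.1, 0.9)
    n_remove = int(round(fraction * len(seq)))
    n_remove = min(max(n_remove, 1), len(seq) - 1)  # always keep at least one residue
    if rng.random() < 0.5:
        return seq[n_remove:]   # N-terminal part removed
    return seq[:-n_remove]      # C-terminal part removed


def fragment_proteome(proteins, proportion, rng):
    """Fragment a given proportion of the proteins in a {id: sequence} dict."""
    n_targets = int(round(proportion * len(proteins)))
    targets = set(rng.sample(sorted(proteins), n_targets))
    return {
        pid: fragment_protein(seq, rng) if pid in targets else seq
        for pid, seq in proteins.items()
    }


rng = random.Random(1)
original = "A" * 40 + "C" * 60
fragment = fragment_protein(original, rng)
proteome = {f"p{i}": original for i in range(10)}
simulated = fragment_proteome(proteome, 0.3, rng)
```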

BUSCO comparisons

BUSCO 3 v.5.2.2 was run on UniProt Reference Proteomes and simulated data, using the odb10 version of the BUSCO dataset of the most specific lineage possible covering the target proteome and with default options. The corresponding dataset is available in Supplementary Table 3 . The summarized result for the BUSCO run was then extracted from the summary file.

Avian proteome fragment analysis

To compare fragmented sets between all avian UniProt Reference Proteomes, we first queried the OMArk results for the proteins classified as lineage consistent fragments and then used the OMAmer placement file to obtain the gene families to which they were associated. To avoid biasing the comparisons in the presence of duplication, we associated each gene name to their whole gene family identifier (root HOG) rather than to their subfamilies. The overlap between fragmented gene sets between two species was computed by directly comparing sets of their associated gene families using the following formula, for two sets A and B : \(\frac{\vert A \cap B \vert}{\min(\vert A \vert,\vert B \vert)}\) . The denominator was chosen to be the cardinality of the smallest set in order to not underestimate overlap in the smallest sets.
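The overlap measure above is the overlap coefficient of two sets. The sketch below computes it for two sets of root HOG identifiers, together with the variant that uses the target proteome's set size as the denominator (as applied to the zebra finch comparison). The function names and the convention of returning 0.0 for an empty set are assumptions.

```python
def overlap_coefficient(a, b):
    """|A ∩ B| / min(|A|, |B|) for two collections of root HOG identifiers."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0  # assumption: no overlap is defined when a set is empty
    return len(a & b) / min(len(a), len(b))


def overlap_with_reference(target, reference):
    """|A ∩ B| / |A|, where A is the target proteome's fragmented family set."""
    target, reference = set(target), set(reference)
    if not target:
        return 0.0
    return len(target & reference) / len(target)


species_a = {"HOG:1", "HOG:2", "HOG:3"}
species_b = {"HOG:2", "HOG:3", "HOG:4", "HOG:5"}
shared = overlap_coefficient(species_a, species_b)
```

Dividing by the smaller set means that a small fragmented set fully contained in a larger one scores 1, which is the behavior the text motivates.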

The zebra finch proteome was downloaded from the Ensembl archive for version 85 of the database. Overlap with the zebra finch taxonomically consistent fragment set was done as above but using the cardinality of the target proteome’s fragmented gene family set as the denominator.

Comparisons of NCBI and Ensembl proteomes

OMAmer and OMArk were run using FASTA and isoform files as input, along with the NCBI taxid of the proteome. We ran BUSCO with a FASTA file and the odb10 version of the BUSCO dataset, selecting the most specific lineage possible that covers the target proteome, and applied default parameters. Proteomes were matched to an assembly using either metadata downloaded from NCBI or downloaded from a species information file available on the Ensembl website. For NCBI and Ensembl proteomes with matching assemblies, we made pairwise comparisons for each OMArk value in the completeness and consistency assessment. When a species had an available proteome in OMA sourced from Ensembl, as per the OMA November 2022 release information, we marked these accordingly to assess bias in OMArk results.

Runtimes of BUSCO and OMArk

BUSCO v5.2.2 and the OMArk pipeline (OMAmer v2.0.0 and OMArk v0.3.0) were run on the 1,805 UniProt Reference proteomes using a Snakemake pipeline 21 and a Slurm scheduler. BUSCO and OMAmer were run with default parameters. We launched OMArk with a taxonomic identifier and an optional FASTA file. We ran BUSCO in offline mode with the required lineage folders available locally. All software was configured to run in serial mode, using only one thread. We obtained the job performance, including CPU time for each software on each proteome, from the Slurm scheduler efficiency report with the ‘seff’ command on each Slurm job. All computation was performed on the UNIL high-performance computer Curnagl, a 96-node cluster based on AMD Zen2/3 CPUs. We requested 15 GB of memory for OMAmer, 10 GB for OMArk and 25 GB for BUSCO, which was enough to avoid any out-of-memory errors. All data were read and written on a 150 TB SSD-based scratch system.

Additional analysis

The additional analyses were performed in Python (v. 3.9.5) within a Jupyter Notebook. Plots were created using the matplotlib (version 3.4.2) 22 and the Seaborn 23 (v0.11.2) libraries. Notebooks used for this paper are available in the associated Zenodo archive. The Notebook “Human_missing_genes.ipynb” is for investigating human genes deemed as missing and the Notebook “blobtoolkit_contamination_check.ipynb” is for validating OMArk contamination results with BlobToolKit. Other companion Notebooks are available on the OMArk GitHub repository, in the utils folder. “Contextualize_OMA.ipynb” allows investigation of OMArk’s missing and fragmented genes using OMA public data (sequences and synteny) and provides instructions to perform assembly completeness assessment. The Notebook “Explore_Data.ipynb” allows visualization of many OMArk results at once.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

UniProt Reference proteomes were downloaded from UniProtKB on 1 February 2022 (version 04/2021) through their ftp server. Ensembl Metazoa proteomes were downloaded from their ftp website from version 52, 53 and 54. NCBI proteomes were downloaded in August 2023 through the NCBI datasets python library, and proteomes from Ensembl 110 and Ensembl databases 57 were downloaded through their respective ftp websites. All datasets used and generated during the study and Supplementary Table files are made available through Zenodo ( https://doi.org/10.5281/zenodo.10034236 ) 24 . Precomputed results for UniProt, GenBank and Ensembl are made available through the OMArk web server ( https://omark.omabrowser.org ). Source data are provided with this paper.

Code availability

OMArk is available on GitHub ( https://github.com/DessimozLab/OMArk ) and as a python package on PyPI and Anaconda. OMArk version 0.3.0 and OMAmer version 2.0.0 that were used for all analyses are available on Zenodo ( https://doi.org/10.5281/zenodo.10474466 ) 25 .

Blaxter, M. et al. Why sequence all eukaryotes? Proc. Natl. Acad. Sci. USA 119 , e2115636118 (2022).

Lawniczak, M. K. N. et al. Standards recommendations for the Earth BioGenome Project. Proc. Natl. Acad. Sci. USA 119 , e2115639118 (2022).

Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38 , 4647–4654 (2021).

Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31 , 3210–3212 (2015).

Saary, P., Mitchell, A. L. & Finn, R. D. Estimating the quality of eukaryotic genomes recovered from metagenomic analysis with EukCC. Genome Biol. 21 , 244 (2020).

Kemena, C., Dohmen, E. & Bornberg-Bauer, E. DOGMA: a web server for proteome and transcriptome quality assessment. Nucleic Acids Res. 47 , W507–W510 (2019).

Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P. & Tyson, G. W. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25 , 1043–1055 (2015).

UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 49 , D480–D489 (2021).

Nevers, Y., Glover, N. M., Dessimoz, C. & Lecompte, O. Protein length distribution is remarkably uniform across the tree of life. Genome Biol. 24 , 135 (2023).

Altenhoff, A. M. et al. OMA orthology in 2021: website overhaul, conserved isoforms, ancestral gene order and more. Nucleic Acids Res. 49 , D373–D379 (2021).

Rossier, V. et al. OMAmer: tree-driven and alignment-free protein assignment to subfamilies outperforms closest sequence approaches. Bioinformatics 37 , 2866–2873 (2021).

Altenhoff, A. M., Gil, M., Gonnet, G. H. & Dessimoz, C. Inferring hierarchical orthologous groups from orthologous gene pairs. PLoS One 8 , e53786 (2013).

Altenhoff, A. M. et al. OMA orthology in 2024: improved prokaryote coverage, ancestral and extant GO enrichment, a revamped synteny viewer and more in the OMA Ecosystem. Nucleic Acids Res. 52 , D513–D521 (2024).

Kim, Y.-M. et al. Genome analysis of Hibiscus syriacus provides insights of polyploidization and indeterminate flowering in woody plants. DNA Res. 24 , 71–80 (2017).

Feng, S. et al. Dense sampling of bird diversity increases power of comparative genomics. Nature 587 , 252–257 (2020).

Cunningham, F. et al. Ensembl 2022. Nucleic Acids Res. 50 , D988–D995 (2022).

Yates, A. D. et al. Ensembl Genomes 2022: an expanding genome resource for non-vertebrates. Nucleic Acids Res. 50 , D996–D1003 (2022).

Cornet, L. & Baurain, D. Contamination detection in genomic data: more is not enough. Genome Biol. 23 , 60 (2022).

Challis, R., Richards, E., Rajan, J., Cochrane, G. & Blaxter, M. BlobToolKit - Interactive Quality Assessment of Genome Assemblies. G3 (Bethesda) 10 , 1361–1374 (2020).

Schoch, C. L. et al. NCBI Taxonomy: a comprehensive update on curation, resources and tools. Database (Oxford) 2020 , baaa062 (2020).

Mölder, F. et al. Sustainable data analysis with Snakemake. F1000Res. 10 , 33 (2021).

Hunter, J. D. Matplotlib: a 2D graphics environment. J. Comput. Sci. Eng. 9 , 90–95 (2007).

Waskom, M. seaborn: statistical data visualization. J. Open Source Softw. 6 , 3021 (2021).

Nevers, Y. et al. Multifaceted quality assessment of gene repertoire annotation with OMArk [datasets]. Zenodo https://doi.org/10.5281/zenodo.10034236 (2022).

Nevers, Y., Warwick Vesztrocy, A. & Altenhoff, A. M. OMArk version 0.3.0 [computer code]. Zenodo https://doi.org/10.5281/zenodo.10474466 (2024).

Acknowledgements

We thank R. Waterhouse and M. Blaxter for their helpful feedback and comments during the development process. We thank D. Jyothi for his help and for providing metadata for the UniProt Reference Proteomes dataset. Finally, we thank the three anonymous reviewers for their insightful feedback and constructive suggestions. Y.N., A.W.V., V.R. and C.D. were supported by the Swiss National Science Foundation (grants 183723 and 205085).

Open access funding provided by University of Lausanne

Author information

Authors and affiliations.

Department of Computational Biology, University of Lausanne, Lausanne, Switzerland

Yannis Nevers, Alex Warwick Vesztrocy, Victor Rossier, Clément-Marie Train, Christophe Dessimoz & Natasha M. Glover

Swiss Institute of Bioinformatics, Lausanne, Switzerland

Yannis Nevers, Alex Warwick Vesztrocy, Victor Rossier, Adrian Altenhoff, Christophe Dessimoz & Natasha M. Glover

Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland

Victor Rossier

Department of Computer Science, ETH Zurich, Zurich, Switzerland

Adrian Altenhoff

Contributions

Y.N. designed, developed and tested the OMArk software. Y.N., C.D. and N.G. designed the experiments. Y.N. and N.G. carried out the analyses. A.A. and C.T. designed and developed the OMArk web server. V.R. and A.W.V. developed the OMAmer software and adapted it to the needs of the OMArk pipeline. Y.N., C.D. and N.G. wrote and edited the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yannis Nevers.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Biotechnology thanks Arang Rhie and the other, anonymous reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information.

Supplementary Analyses. Supplementary Tables 7–9 and Supplementary Figures 1–56.

Reporting Summary

Supplementary tables.

Supplementary Tables 1–6.

Source Data Figs. 1–6

Table 1. Source Data Fig. 1. OMArk results for Danio rerio. Table 2. Source Data Fig. 3. OMArk results for simulation of six species (full results in Supplementary Table 1). Table 3. Source Data Fig. 4. OMArk results for 1,805 UniProt species (full results in Supplementary Table 3). Table 4. Source Data Fig. 5. OMArk and BUSCO results and fragments proportion (full results in Supplementary Table 3). Table 5. Source Data Fig. 6. OMArk results for Myomorpha UniProt proteomes and for Bombus impatiens Ensembl Metazoa proteomes (full results in Supplementary Tables 3 and 5).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Nevers, Y., Warwick Vesztrocy, A., Rossier, V. et al. Quality assessment of gene repertoire annotations with OMArk. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-024-02147-w

Received: 16 December 2022

Accepted: 17 January 2024

Published: 21 February 2024

DOI: https://doi.org/10.1038/s41587-024-02147-w

Attention English majors: now you can add handwritten notes to Google Docs

But only if you have an Android device with a touchscreen.

By Joanna Nelius, laptop reviewer. She has covered consumer technology, with an emphasis on PC gaming, since 2018. Previous bylines: USA Today, Gizmodo, PC Gamer, Maximum PC, among others.

A screenshot of a text document with colored handwriting in the margins.

For anyone who has ever wished for the ability to hand annotate directly on a Google Doc, Google announced on Tuesday it’s rolling out a new markup feature for Google Workspace customers, Google Workspace Individual subscribers, and personal Google accounts that will allow users to write directly on a Google Doc with a stylus or their finger. The new feature includes a few standard pen and highlighter colors (black, blue, red, green, yellow), and an eraser. If you don’t like any of those colors, you can add your own.

There are so many use cases for a feature like this, across age groups, industries, and professional and personal work. Google calls out some good ones in its announcement, especially for “educators giving students feedback on their essays, reports, short stories.” Anecdotally, I cannot overstate how useful this feature could be for creative writing students and professors, specifically.

It melds the old-school way of handing out hard copies of your work with current computing technologies and classroom management platforms that can integrate with Google Drive, like Canvas or Blackboard. But you’ll never have to worry about lugging a massive stack of paper to class on your workshop day — or running out of print credits at the library. (Flash back to undergrad, circa 2006, when my alma mater started charging students for printing coursework at the library.)

There’s also the tactile aspect of this new feature; a lot of us creative writers prefer to handwrite development notes on our peers’ work, not only because it is more personal but also because Google Docs’ system for tracking edits and comments can quickly clutter the page. The contrast of typed text and handwritten notes on the same page can make the information easier for the writer receiving feedback to parse. Especially if you write in a fun color. (Shout out to Allison and her baby blue gel pens for making constructive criticism fun.)

But there’s a glaring issue with the beginning of its rollout: hand annotation is only available on Android devices. Windows, macOS, iOS, ChromeOS, and even Chrome Browser users on any of those devices only have the option of viewing documents with markups, and showing, hiding, and deleting them. So that shortens the list of compatible devices to Android phones and tablets. Many K-12 students use non-touchscreen clamshell Chromebooks provided by the school for writing and feedback assignments, and the older the student, the more likely they will have a Windows or macOS clamshell laptop.

There’s another issue: in the last 10 years, I can count the number of times I’ve seen one of my students or grad school colleagues writing on a tablet or 2-in-1 on half a hand. So while it seems like Google has created a great feature that can be used by educators and students in and out of the classroom, most won’t have compatible devices.

If Google opened up the feature to include Windows, macOS, and iOS devices, that could help alleviate the issue. Apple accounts for 55.9 percent of the tablet market, according to an October 2023 report published by Statista, and while the 2-in-1 laptop market is constantly growing, some colleges encourage their students to purchase convertible Windows laptops, depending on their major.

Google’s rollout of its new feature started February 27th for users enrolled in Rapid Release. Users under Google Standard release will start seeing the new feature on March 11th. Each rollout will take about 15 days.

If you don’t know your release track, from the Admin console go to Menu > Account > Account settings > Preferences > Release preferences > New features to check.

COMMENTS

  1. Annotating Texts

    A common concern about annotating texts: It takes time! Yes, it can, but that time isn't lost—it's invested. Spending the time to annotate on the front end does two important things: It saves you time later when you're studying. Your annotated notes will help speed up exam prep, because you can review critical concepts quickly and ...

  2. How to Annotate Texts

    This resource is a good place to start for a student who has never had to take notes on film before. It briefly outlines three general approaches to note-taking on a film. ... Annotation Practice Worksheet (La Guardia Community College) This worksheet has a sample text and instructions for students to annotate it. It is a useful resource for ...

  3. Strategies for Teaching How To Annotate

    Annotation Activity: Create a dice game where students have to find concepts and annotate them based on the number they roll. For example, 1 = Circle and define a word you don't know, 2 = Underline a main character, 3 = Highlight the setting, etc. Teaching students how to annotate gives them an invaluable tool for actively engaging with a text.

  4. Research Guides: Reading and Study Strategies: Annotating a Text

    Annotating is any action that deliberately interacts with a text to enhance the reader's understanding of, recall of, and reaction to the text. Sometimes called "close reading," annotating usually involves highlighting or underlining key pieces of text and making notes in the margins of the text. This page will introduce you to several ...

  5. Developing Linguistic Corpora: a Guide to Good Practice

    Corpus annotation is the practice of adding interpretative linguistic information to a corpus. For example, one common type of annotation is the addition of tags, or labels, indicating the word class to which words in a text belong. This is so-called part-of-speech tagging (or POS tagging), and can be useful, for example, in distinguishing ...

  6. Annotating a Text

    Annotating a text, or marking the pages with notes, is an excellent, if not essential, way to make the most out of the reading you do for college courses. Annotations make it easy to find important information quickly when you look back and review a text. They help you familiarize yourself with both the content and organization of what you read ...

  7. How to annotate: 5 strategies for success

    Pick one color and use it throughout the text, or assign specific colors to specific points. For instance, yellow for key points, green for supporting information, red for questions, etc. Being consistent will ensure you can understand your annotations when you review them later. Include a key or legend.

  8. How Students and Teachers Benefit From Students Annotating Their Own

    The Benefits of Annotation for Students and for Teachers. For students, the potential positives of unpacking and explaining their own writing were instantly apparent and significant. These are ...

  9. Annotations

    Definition and Purpose. Annotating literally means taking notes within the text as you read. As you annotate, you may combine a number of reading strategies—predicting, questioning, dealing with patterns and main ideas, analyzing information—as you physically respond to a text by recording your thoughts. Annotating may occur on a first or ...

  10. Annotating text: The complete guide to close reading

    Learning to effectively annotate text is a powerful tool that can improve your reading, self-learning, and study strategies. Using an annotating system that includes text annotations and note-taking during close reading helps you actively engage with the text, leading to a deeper understanding of the material.

  11. Creative Annotation Can Improve Students' Reading ...

    Illustrated annotations use images to increase comprehension and understanding. Students create illustrations to represent concepts and elements of literature. Prior to reading the text, the students create a visual representation or symbol for the concept or element of focus for the learning target. When the students annotate the text, they ...

  12. About Annotation (and an opportunity to practice)

    1 About Annotation (and an opportunity to practice) Marginalia flickr photo by Cat Sidh shared under a Creative Commons (BY-NC-ND) license. Annotation, the act of adding additional information as a note attached to a specific part of a published work (or simply highlighting key passages), is a familiar academic but also everyday practice.. As described in Remi Kalir and Antero Garcia's book ...

  13. Annotating evidence of professional learning

    An annotation is a statement that provides context for your evidence of professional learning and explains its significance. It is a story of your professional knowledge, practice, and engagement. It could be in the form of notations on an artefact (an individual piece of evidence, e.g. a lesson plan, piece of professional reading, or meeting ...

  14. What Is an Annotated Bibliography?

    Published on March 9, 2021 by Jack Caulfield . Revised on August 23, 2022. An annotated bibliography is a list of source references that includes a short descriptive text (an annotation) for each source. It may be assigned as part of the research process for a paper, or as an individual assignment to gather and read relevant sources on a topic.

  15. Data Annotation in 2024: Why it matters & Top 8 Best Practices

    5. Video annotation. Video annotation is the process of teaching computers to recognize objects from videos. Image and video annotation are types of data annotation methods that are performed to train computer vision (CV) systems, which is a subfield of artificial intelligence (AI).

  16. Teaching Student Annotation: Constructing Meaning Through Connections

    Overview. Students learn about the purposes and techniques of annotation by examining text closely and critically. They study sample annotations and identify the purposes annotation can serve. Students then practice annotation through a careful reading of a story excerpt, using specific guidelines and writing as many annotations as possible.

  17. Annotate

    The practice of annotating is beneficial for different reasons, depending on the type of text one is reading. ... and is a good option when the reader wants more organized annotations and does not ...

  18. Why & How To Annotate A Book

    Tips for Effective Book Annotation. Now you know why you should annotate your books, here are my tips for how to annotate your books for fun. 1. Use a pencil . Now I really do sound like a teacher talking! It's always a good idea to use a pencil instead of a pen so that you can erase your annotations later or adjust them while you're reading.

  19. Annotating Text Strategies That Enhance Close Reading [Free ...

    Benefits of Annotating a Text. The benefits of annotation include: Keeping track of key ideas and questions. Helping formulate thoughts and questions for deeper understanding. Fostering analyzing and interpreting texts. Encouraging the reader to make inferences and draw conclusions about the text. Allowing the reader to easily refer back to the ...

  20. Pros and Cons of Type Hints

    Annotations were introduced in Python 3.0, and it's possible to use type comments in Python 2.7. Still, improvements like variable annotations and postponed evaluation of type hints mean that you'll have a better experience doing type checks using Python 3.6 or even Python 3.7. Type hints introduce a slight penalty in start-up time.

  21. How to annotate a sketchbook: a guide for art students

    It is good practice for students to get in the habit of clearly crediting all work from others. This is an excerpt of an Edexcel GCSE Art and Design sketchbook by Justine Ho, West Island School. Underneath the image by Jim Dine, Justine has clearly written the name, date, and source of the image. Justine was awarded A* (100%) for GCSE Art.

  22. 17 Awesome Annotation Activities

    Having them read others' work is great practice and symbols make great annotation tools! 4. Annotate Books. Before you can annotate a book, it's important to read it actively. Meaning, engaging with the text, taking notes, and highlighting key points. This is key when teaching students about annotation.

  23. Annotating Text in 3rd, 4th, and 5th Grade

    Annotating text will seem tedious to 3rd, 4th, and 5th grade students as they first begin, but soon you will see the quality and the amount that they are writing in those margins increase. Model annotating text first to ease the students' minds. Then, annotate together before you let them try it on their own. This gradual release will have ...

  24. Quality assessment of gene repertoire annotations with OMArk

    Annotations for these genomes were performed using, among other sources of evidence, homology from the Ensembl 85 (ref. 16) annotation of zebra finch 15 (Taeniopygia guttata; taeGut3.2.4 assembly ...

  25. Google adds hand-annotation markup feature to Docs

    By Joanna Nelius, laptop reviewer. She has covered consumer technology, with an emphasis on PC gaming, since 2018. Previous bylines: USA Today, Gizmodo, PC Gamer, Maximum PC, among others.
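
Item 20 in the list above refers to Python function annotations and variable annotations. As a minimal sketch of what those look like (the function name `word_count` and the variable `threshold` are hypothetical, chosen only for illustration and not taken from any of the listed articles):

```python
def word_count(text: str, min_length: int = 1) -> int:
    """Count the words in `text` that have at least `min_length` characters."""
    # The annotations (str, int, -> int) are hints only; Python does not
    # enforce them at runtime.
    return sum(1 for word in text.split() if len(word) >= min_length)


# Variable annotation syntax (available from Python 3.6 onward).
threshold: int = 4

# Function annotations are stored on the function object and can be inspected.
hints = word_count.__annotations__

# Count the words of four or more characters in a sample sentence.
count = word_count("annotate the text as you read", threshold)
```

A static type checker such as mypy reads these hints to flag mismatched arguments before the code runs; the interpreter itself ignores them.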