Indexing office documents and Flash files

Yandex indexes HTML documents and files of the following types: PDF; DOC/DOCX, XLS/XLSX, and PPT/PPTX (MS Office); ODS, ODP, ODT, and ODG (OpenOffice); RTF; TXT; and SWF. SWF files are indexed if they are referenced directly or embedded in HTML code using the object or embed element. If an SWF file contains useful content, the HTML document that embeds it can be found in search by the content indexed from the SWF file.

When new software versions are released, it may take a while before the new file formats are supported.
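
As a rough illustration of the list above, a site owner could check which URLs point to files with one of the listed extensions. This is only a sketch for auditing your own site: the function name has_listed_extension is hypothetical, and matching by extension is an assumption made here for simplicity, not a description of how the Yandex robot detects file types.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the list of supported file types above.
# HTML pages are not covered by this check.
LISTED_EXTENSIONS = {
    "pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx",
    "ods", "odp", "odt", "odg", "rtf", "txt", "swf",
}


def has_listed_extension(url: str) -> bool:
    """Return True if the URL path ends in one of the listed extensions."""
    # Only the path is inspected, so query strings and fragments are ignored.
    suffix = PurePosixPath(urlparse(url).path).suffix
    return suffix.lower().lstrip(".") in LISTED_EXTENSIONS
```

For example, has_listed_extension("https://example.com/files/report.PDF?v=2") returns True, while a URL ending in .zip returns False.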

Restrictions on the indexed data:

  • Documents larger than 10 MB aren't indexed.
  • If a PDF document contains only images, the first three pages are indexed. A PDF document that also contains text is indexed in full.

  • In Flash documents, the text from the following blocks is indexed:

    • DefineText.

    • DefineText2.

    • DefineEditText.

    • Metadata.

  • In Flash documents, links are indexed if they are in the following blocks:

    • DoAction.

    • DefineButton.

    • DefineButton2.
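
To see which of the blocks listed above a Flash file actually contains, you can walk its tag records. The sketch below is an illustration under stated assumptions, not the robot's own logic: the function names swf_tag_codes and summarize_swf are hypothetical, the numeric tag codes come from the SWF file format specification, "10 MB" is assumed to mean 10 * 1024 * 1024 bytes of the file as stored, and only uncompressed (FWS) and zlib-compressed (CWS) files are handled.

```python
import struct
import zlib
from typing import Dict, List

# Assumption: the 10 MB limit is 10 * 1024 * 1024 bytes of the file as stored.
MAX_INDEXED_BYTES = 10 * 1024 * 1024

# Tag codes as defined in the SWF file format specification.
TEXT_TAGS = {11: "DefineText", 33: "DefineText2", 37: "DefineEditText", 77: "Metadata"}
LINK_TAGS = {12: "DoAction", 7: "DefineButton", 34: "DefineButton2"}


def swf_tag_codes(data: bytes) -> List[int]:
    """Return the tag codes of an SWF file in the order they appear."""
    signature = data[:3]
    body = data[8:]  # skip signature (3), version (1), file length (4)
    if signature == b"CWS":
        body = zlib.decompress(body)
    elif signature != b"FWS":
        raise ValueError("not an FWS or CWS file")

    # The frame-size RECT starts with a 5-bit field giving the bit width
    # of each of its four coordinates; the record is padded to a full byte.
    nbits = body[0] >> 3
    pos = (5 + 4 * nbits + 7) // 8
    pos += 4  # frame rate (2 bytes) + frame count (2 bytes)

    codes = []
    while pos + 2 <= len(body):
        (header,) = struct.unpack_from("<H", body, pos)
        pos += 2
        code, length = header >> 6, header & 0x3F
        if length == 0x3F:  # long record: the real length follows in 4 bytes
            (length,) = struct.unpack_from("<I", body, pos)
            pos += 4
        codes.append(code)
        if code == 0:  # End tag
            break
        pos += length  # skip the tag payload
    return codes


def summarize_swf(data: bytes) -> Dict[str, object]:
    """Report the size check and which listed blocks are present."""
    codes = swf_tag_codes(data)
    return {
        "within_size_limit": len(data) <= MAX_INDEXED_BYTES,
        "text_blocks": sorted({TEXT_TAGS[c] for c in codes if c in TEXT_TAGS}),
        "link_blocks": sorted({LINK_TAGS[c] for c in codes if c in LINK_TAGS}),
    }
```

Calling summarize_swf(open("banner.swf", "rb").read()) on a hypothetical file banner.swf returns a dictionary naming the text and link blocks found. A file with empty text_blocks and link_blocks has nothing in the blocks listed above. Truncated or malformed files may raise an exception in this sketch.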

Excluding pages from the search results is not an error on the part of the site or the indexing robot: the robot excludes pages that users wouldn't be able to find using search queries. Their exclusion therefore shouldn't affect the visibility of the site's indexed pages.

Contact support if:

  • Pages were ranked high in the search results before they were excluded.
  • The site's position dropped sharply after the pages were excluded.
  • The number of click-throughs from the search engine dropped significantly after the pages were excluded.