It is convenient to group book excerpts by chapter.
book_toc = get_table_of_contents_from_epub(TEST_BOOKS_DIR + TEST_EPUB_HOWTO)
self.assertIsNotNone(book_toc)
for elem in book_toc:
    print(elem.href)
    print(elem.uid)
    print(elem.title)
The table of contents can be retrieved as shown above.
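`get_table_of_contents_from_epub` itself is not shown in these notes. As a rough, stdlib-only sketch of how such a lookup could work, the snippet below assumes an EPUB 2 style book that ships an NCX file and reads its `navPoint` entries. `TocEntry` and `read_ncx_toc` are hypothetical names, and picking the first `.ncx` file in the archive is a shortcut; a stricter reader would resolve it through the package manifest.

```python
import xml.etree.ElementTree as ET
import zipfile
from typing import List, NamedTuple

NCX_URI = "http://www.daisy.org/z3986/2005/ncx/"
NCX_NS = {"ncx": NCX_URI}


class TocEntry(NamedTuple):
    # Hypothetical container mirroring the uid/title/href fields printed above.
    uid: str
    title: str
    href: str


def read_ncx_toc(epub_file) -> List[TocEntry]:
    """Return the NCX navigation points of an EPUB in document order."""
    with zipfile.ZipFile(epub_file) as archive:
        # Shortcut (assumption): take the first .ncx file found in the archive.
        ncx_name = next((n for n in archive.namelist() if n.endswith(".ncx")), None)
        if ncx_name is None:
            raise ValueError("no NCX table of contents found in this archive")
        root = ET.fromstring(archive.read(ncx_name))

    entries = []
    # iter() walks nested navPoints too, so sub-chapters are included.
    for point in root.iter("{%s}navPoint" % NCX_URI):
        label = point.find("ncx:navLabel/ncx:text", NCX_NS)
        content = point.find("ncx:content", NCX_NS)
        if content is None:
            continue
        title = (label.text or "").strip() if label is not None else ""
        entries.append(TocEntry(point.get("id", ""), title, content.get("src", "")))
    return entries
```

Books without an NCX file would raise `ValueError` here, which makes unsupported files easy to spot in a test run.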
Some tests that are already passing:
# 24/02/01 - realised chapter information about each book would be nice.
class TestingBookTableOfContents(unittest.TestCase):
    def test_can_get_table_of_contents_for_a_given_book(self):
        book_toc = get_table_of_contents_from_epub(TEST_BOOKS_DIR + TEST_EPUB_HOWTO)
        self.assertIsNotNone(book_toc)

    def test_can_match_highlight_with_a_chapter(self):
        HIGHLIGHT_ID = "871ca40f-3d58-4fff-be19-400e700bdad6"
        # Only `section` is used below; the other fields are unpacked for reference.
        title, _, highlight, _, section, path = get_highlight_from_database(HIGHLIGHT_ID)
        book_toc = get_table_of_contents_from_epub(TEST_BOOKS_DIR + TEST_EPUB_HOWTO)
        self.assertEqual(
            match_highlight_section_to_chapter(section, book_toc),
            'Introduction: Surviving Usefulness',
        )
Chapter matching only works for one book so far; it would be nice to extend it to more books, and also to deduce chapter order.
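A minimal sketch of how matching and chapter order could be handled together, not the actual `match_highlight_section_to_chapter`. It assumes the highlight's section and each TOC `href` both end in the same content file name, possibly followed by a `#fragment`, and it treats the position in the TOC list as the chapter order. `TocEntry` and `find_chapter` are hypothetical names.

```python
import posixpath
from typing import NamedTuple, Optional, Sequence, Tuple


class TocEntry(NamedTuple):
    # Hypothetical stand-in for the TOC elements (uid, title, href).
    uid: str
    title: str
    href: str


def _file_name(ref: str) -> str:
    """Drop any '#fragment' and leading directories, keeping the file name."""
    return posixpath.basename(ref.split("#", 1)[0])


def find_chapter(section: str, toc: Sequence[TocEntry]) -> Optional[Tuple[int, str]]:
    """Return (1-based position in the TOC, chapter title), or None if unmatched."""
    target = _file_name(section)
    for position, entry in enumerate(toc, start=1):
        if _file_name(entry.href) == target:
            return position, entry.title
    return None
```

Returning `None` for unmatched sections, rather than raising, would let a test list every highlight that fails to match in a given book.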
Progress has been made, and chapters already work for some books. However, .epub files do not seem to follow a single, widely adopted table-of-contents standard.
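Part of the variation may come from the format itself: EPUB 2 books describe their contents in an NCX file, while EPUB 3 books use an XHTML navigation document containing a `<nav epub:type="toc">` element. The sketch below shows one way the EPUB 3 form could be read with the standard library. `read_nav_toc` is a hypothetical helper, and it assumes the navigation document is well-formed XML.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

XHTML = "{http://www.w3.org/1999/xhtml}"
EPUB_TYPE = "{http://www.idpf.org/2007/ops}type"


def read_nav_toc(nav_xhtml: str) -> List[Tuple[str, str]]:
    """Return (title, href) pairs from the toc <nav> of an EPUB 3 nav document."""
    root = ET.fromstring(nav_xhtml)
    for nav in root.iter(XHTML + "nav"):
        # epub:type can hold several space-separated values.
        if "toc" in nav.get(EPUB_TYPE, "").split():
            return [
                ("".join(link.itertext()).strip(), link.get("href", ""))
                for link in nav.iter(XHTML + "a")
            ]
    return []
```

An empty result could then serve as the signal to fall back to the NCX file.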
For The Shallows, for example, not all chapters are matched. I'll devise some tests for this case.
This is now fixed and covered by tests. Other books still have chapter-indexing issues, and I feel I should be mimicking the Kobo's behaviour instead, since no book has problems on the device.
Status | done |
---|---|
Priority | medium |