On this page
On this page
To E2E or Not
That is the question.
I've been working on a text recognition model at work, and the way it's currently set up, the model outputs something, which then is fed into a separate piece of logic/algorithm that then resolves the output to the text sequence itself. It's kind of like having a [[conditional-random-field]] at the end, but our post-processing sequence inference is much more involved.
On the other extreme you have e2e models (LSTMs, Transformers)
Edit this page
Last updated on 10/24/2022