I want to create a Python script that will parse 40.000 PDF files(text and images). Since I saw that there is no easy method to check if a page contains images I think I should use textract module.
Ideally I would deploy to Google App Engine.
My question is, for textract I've also installed other packages beside Python to my system. Can I deploy the script(with proper requirements.txt file) on Google Cloud App Engine without problem? or I will to use something else?
解决方案
It is possible to use App Engine, but only with the Flexible environment and using a custom runtime, which allows you to add non-python dependencies (and also python dependencies not installable via pip):
Custom runtimes allow you to define new runtime environments, which
might include additional components like language interpreters or
application servers.