Amazon Textract
With Amazon Textract, you can extract text from assets of the content types Document, Spreadsheet, Presentation, and Attachment. Brightspot associates the extracted text with the asset, so editors can search for the asset and use it in their own content.
Note
The Amazon Textract integration is currently not available for image files you add to Brightspot.
This section describes how to configure the Amazon Textract integration in Brightspot, and how to view extracted text.
Including Amazon Textract in a Brightspot build
The following table lists the dependencies to include in your build configuration.
Artifact | Description |
com.psddev:aws-textract | Exposes Textract-related controls in Sites & Settings, as well as the UI and processing to submit and display results of Textract jobs. |
Runtime prerequisites
- Developer configuration: Extend the TextractPostProcessor class to process the results that Textract returns.
- Ops configuration: Textract requires queue, topic, and role ARNs to make API calls. The role must have permission to call Textract (see How Amazon Textract Works with IAM - Amazon Textract). Textract uses the topic to send a notification when a job completes, and the queue is used to post the completion status (see Configuring Amazon Textract for Asynchronous Operations - Amazon Textract).
- CMS configuration: Configure the site that interfaces with Amazon Textract. For details, see Configuring the Amazon Textract integration.
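Before entering the ops values in the CMS, it can help to confirm that each one has the shape of the resource it is supposed to name. The following is a minimal sketch, assuming all three values are supplied as ARNs for an SNS topic, an SQS queue, and an IAM role. The class `TextractArnCheck` and its method are hypothetical and are not part of Brightspot or the AWS SDK, and the patterns are a simplified approximation of the AWS naming rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Brightspot or the AWS SDK.
public class TextractArnCheck {

    // General ARN layout: arn:partition:service:region:account-id:resource
    private static final Pattern SNS_TOPIC =
            Pattern.compile("arn:aws[a-z-]*:sns:[a-z0-9-]+:\\d{12}:[A-Za-z0-9_.-]+");
    private static final Pattern SQS_QUEUE =
            Pattern.compile("arn:aws[a-z-]*:sqs:[a-z0-9-]+:\\d{12}:[A-Za-z0-9_.-]+");
    // IAM is a global service, so the region field of a role ARN is empty.
    private static final Pattern IAM_ROLE =
            Pattern.compile("arn:aws[a-z-]*:iam::\\d{12}:role/[A-Za-z0-9+=,.@_/-]+");

    // Returns one message per value that does not look like the expected ARN.
    public static List<String> problems(String topicArn, String queueArn, String roleArn) {
        List<String> problems = new ArrayList<>();
        check(problems, "topic", SNS_TOPIC, topicArn);
        check(problems, "queue", SQS_QUEUE, queueArn);
        check(problems, "role", IAM_ROLE, roleArn);
        return problems;
    }

    private static void check(List<String> problems, String label, Pattern expected, String value) {
        if (value == null || !expected.matcher(value).matches()) {
            problems.add(label + " ARN is missing or malformed: " + value);
        }
    }

    public static void main(String[] args) {
        // Placeholder account ID and resource names.
        List<String> found = problems(
                "arn:aws:sns:us-east-1:123456789012:AmazonTextractTopic",
                "arn:aws:sqs:us-east-1:123456789012:textract-results",
                "arn:aws:iam::123456789012:role/TextractRole");
        System.out.println(found.isEmpty() ? "ARNs look well-formed" : String.join("\n", found));
    }
}
```

This check covers format only. It does not confirm that the resources exist or that the role actually has permission to call Textract and publish to the topic.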