Comments
ca...@google.com <ca...@google.com> #2
I have forwarded this request to the engineering team. We will update this issue with any progress and with a resolution.
Best Regards,
Josh Moyer
Google Cloud Platform Support
si...@gmail.com <si...@gmail.com> #3
This is not only useful for IP addresses, but also for many other resources. I understand that names are currently used as identifiers, so this request is probably not trivial to implement. Maybe distinguishing between a (numeric, automatically generated) identifier and a (textual) label is the way to go?
ca...@google.com <ca...@google.com> #4
Is there any hope? We have migrated our IP address to a server with a different role, and now the name of this IP address resource doesn't match its role at all. It seems trivial enough to momentarily reserve the static IP address of the old named resource, drop the resource, and immediately recreate it with the new name and the old IP address.
ca...@google.com <ca...@google.com> #5
This would also improve life when using the Google Deployment Manager (since it otherwise errors out if you've changed the name of an IP).
Description
I am using the Google Cloud Vision PHP API to extract text from a PDF. It works, but the result is not correct: some text is missing and the content is in the wrong order.
Please suggest a solution: how can I get the content in the correct order, with a properly formatted result?
I have attached the PDF for reference.
I have attached the output file for reference.
The following PHP code is in use:
<?php
namespace Google\Cloud\Samples\Vision;
require 'vendor/autoload.php';
use Google\Cloud\Core\ServiceBuilder;
use Google\Cloud\Storage\StorageClient;
use Google\Cloud\Vision\V1\AnnotateFileResponse;
use Google\Cloud\Vision\V1\AsyncAnnotateFileRequest;
use Google\Cloud\Vision\V1\Feature;
use Google\Cloud\Vision\V1\Feature\Type;
use Google\Cloud\Vision\V1\GcsDestination;
use Google\Cloud\Vision\V1\GcsSource;
use Google\Cloud\Vision\V1\ImageAnnotatorClient;
use Google\Cloud\Vision\V1\InputConfig;
use Google\Cloud\Vision\V1\OutputConfig;
// Authenticate using a keyfile path
//$cloud = new ServiceBuilder([
// 'keyFilePath' => 'path/to/keyfile.json'
//]);
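// Authenticate using keyfile data
// NOTE: $cloud is never passed to the clients created in detect_pdf_gcs(),
// so those clients do not receive this keyfile.
$cloud = new ServiceBuilder([
    'keyFile' => json_decode(file_get_contents('vision translator-72b1c4a02ca0.json'), true)
]);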
$path = 'gs://path/test-217.pdf';
$output = 'gs://result/';
echo detect_pdf_gcs($path, $output);
function detect_pdf_gcs($path, $output)
{
    # select ocr feature
    $feature = (new Feature())
        ->setType(Type::DOCUMENT_TEXT_DETECTION);

    # set $path (file to OCR) as source
    $gcsSource = (new GcsSource())
        ->setUri($path);
    # supported mime_types are: 'application/pdf' and 'image/tiff'
    $mimeType = 'application/pdf';
    $inputConfig = (new InputConfig())
        ->setGcsSource($gcsSource)
        ->setMimeType($mimeType);

    # set $output as destination
    $gcsDestination = (new GcsDestination())
        ->setUri($output);
    # how many pages should be grouped into each json output file.
    $batchSize = 2;
    $outputConfig = (new OutputConfig())
        ->setGcsDestination($gcsDestination)
        ->setBatchSize($batchSize);

    # prepare request using configs set above
    $request = (new AsyncAnnotateFileRequest())
        ->setFeatures([$feature])
        ->setInputConfig($inputConfig)
        ->setOutputConfig($outputConfig);
    $requests = [$request];

    # make request
    $imageAnnotator = new ImageAnnotatorClient();
    $operation = $imageAnnotator->asyncBatchAnnotateFiles($requests);
    print('Waiting for operation to finish.' . PHP_EOL);
    $operation->pollUntilComplete();

    # once the request has completed and the output has been
    # written to GCS, we can list all the output files.
    preg_match('/^gs:\/\/([a-zA-Z0-9\._\-]+)\/?(\S+)?$/', $output, $match);
    $bucketName = $match[1];
    $prefix = isset($match[2]) ? $match[2] : '';

    $storage = new StorageClient();
    $bucket = $storage->bucket($bucketName);
    $options = ['prefix' => $prefix];
    $objects = $bucket->objects($options);

    # save first object for sample below
    $objects->next();
    $firstObject = $objects->current();

    # list objects with the given prefix.
    print('Output files:' . PHP_EOL);
    foreach ($objects as $object) {
        print($object->name() . PHP_EOL);
    }

    # process the first output file from GCS.
    # since we specified batch_size=2, the first response contains
    # the first two pages of the input file.
    $jsonString = $firstObject->downloadAsString();
    $firstBatch = new AnnotateFileResponse();
    $firstBatch->mergeFromJsonString($jsonString);

    # get annotation and print text
    foreach ($firstBatch->getResponses() as $response) {
        $annotation = $response->getFullTextAnnotation();
        print($annotation->getText());
    }

    $imageAnnotator->close();
}
?>
Please suggest a proper solution to get the output in the proper order, with all of the content, in JSON.
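One possible way to approach the ordering question, offered as a rough sketch rather than a confirmed fix: instead of printing the concatenated text, walk the blocks in each page of the output JSON and sort them by the top-left corner of their bounding box. The sketch assumes the output file has been decoded with json_decode($jsonString, true) and that each page exposes blocks with a boundingBox (normalizedVertices or vertices), paragraphs, words, and symbols. The function names blockText, blockOrigin, and pageTextInReadingOrder, and the $rowHeight parameter, are hypothetical helpers and not part of the Vision client library. A plain top-to-bottom, left-to-right sort is only a heuristic and will likely not suit multi-column layouts.

```php
<?php
// Hypothetical helper: rebuild the text of one block from its symbols,
// inserting spaces and newlines according to the detected break type.
function blockText(array $block): string
{
    $text = '';
    foreach ($block['paragraphs'] ?? [] as $paragraph) {
        foreach ($paragraph['words'] ?? [] as $word) {
            foreach ($word['symbols'] ?? [] as $symbol) {
                $text .= $symbol['text'] ?? '';
                $break = $symbol['property']['detectedBreak']['type'] ?? '';
                if ($break === 'SPACE' || $break === 'SURE_SPACE') {
                    $text .= ' ';
                } elseif ($break === 'EOL_SURE_SPACE' || $break === 'LINE_BREAK') {
                    $text .= "\n";
                }
            }
        }
    }
    return rtrim($text);
}

// Hypothetical helper: smallest x and y over the block's vertices.
// Missing coordinates are treated as 0, since zero values may be omitted in JSON.
function blockOrigin(array $block): array
{
    $box = $block['boundingBox'] ?? [];
    $vertices = $box['normalizedVertices'] ?? $box['vertices'] ?? [];
    $xs = [];
    $ys = [];
    foreach ($vertices as $vertex) {
        $xs[] = $vertex['x'] ?? 0;
        $ys[] = $vertex['y'] ?? 0;
    }
    return [$xs ? min($xs) : 0, $ys ? min($ys) : 0];
}

// Hypothetical helper: return the text of each block of a page, sorted
// top-to-bottom and then left-to-right. Blocks whose top edge falls in the
// same horizontal band of height $rowHeight are treated as one row.
function pageTextInReadingOrder(array $page, float $rowHeight = 0.01): array
{
    $blocks = $page['blocks'] ?? [];
    usort($blocks, function (array $a, array $b) use ($rowHeight): int {
        [$ax, $ay] = blockOrigin($a);
        [$bx, $by] = blockOrigin($b);
        $rowA = (int) floor($ay / $rowHeight);
        $rowB = (int) floor($by / $rowHeight);
        return [$rowA, $ax] <=> [$rowB, $bx];
    });
    return array_map('blockText', $blocks);
}
```

To try it on the output of the code above, decode the downloaded JSON string into an array, loop over its responses, and pass each entry of fullTextAnnotation.pages to pageTextInReadingOrder. The default $rowHeight of 0.01 assumes normalized coordinates in the range 0 to 1; pixel coordinates would need a larger value.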