Thursday, July 01, 2021

Images to PDF file (case study)

0. Prerequisites

$ sudo apt install img2pdf
$ sudo apt install ocrmypdf
$ sudo apt install imagemagick   # provides the convert command used in step 2

1. Download images by script

for n in $(seq 1 23)
do
  sn=$(printf "%02d" "$n")   # zero-pad the page number: 1 -> 01
  echo "$sn"
  wget "https://larepublica.cronosmedia.glr.pe/printed/2021/07/01/lima/pages/$sn.jpeg"
done
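To check the URLs before downloading anything, the loop above can be split into a small function that only prints them. This is a minimal sketch: the names base and page_urls are made up for illustration, and the page range 1 to 23 is the one from this edition.

```shell
#!/bin/sh
# Base URL taken from the download loop above.
base="https://larepublica.cronosmedia.glr.pe/printed/2021/07/01/lima/pages"

# page_urls FIRST LAST: print one zero-padded page URL per line (no download).
page_urls() {
  n=$1
  while [ "$n" -le "$2" ]; do
    printf '%s/%02d.jpeg\n' "$base" "$n"
    n=$((n + 1))
  done
}

# Usage (left commented out so the sketch itself never touches the network):
#   page_urls 1 23              # just list the URLs
#   page_urls 1 23 | wget -i -  # download them; wget reads the URL list from stdin
```

Keeping the zero padding matters later: 01.jpeg ... 23.jpeg sort in page order when expanded by *.jpeg.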
 

2. Convert images to PDF (after running the download script with sh)

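$ img2pdf *.jpeg --output rep1.pdf   # embeds the JPEGs without re-encoding them

or, with ImageMagick:

$ convert *.jpeg rep2.pdf

$ ocrmypdf rep1.pdf rep1_ocr.pdf   # adds an OCR text layer; may also reduce the file size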
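If a download fails, the PDF is silently built with a page missing. A quick check before converting could look like the sketch below. The function name missing_pages is hypothetical, and the file naming (01.jpeg to 23.jpeg) is assumed from the download script.

```shell
#!/bin/sh
# missing_pages DIR FIRST LAST: print the expected page files that are
# absent or empty in DIR, one per line. Prints nothing when all are present.
missing_pages() {
  n=$2
  while [ "$n" -le "$3" ]; do
    f=$(printf '%02d.jpeg' "$n")
    [ -s "$1/$f" ] || echo "$f"
    n=$((n + 1))
  done
}

# Usage (commented out; run it in the download directory):
#   if [ -z "$(missing_pages . 1 23)" ]; then
#     img2pdf *.jpeg --output rep1.pdf
#   else
#     echo "missing pages:"; missing_pages . 1 23
#   fi
```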

 

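Note for the convert command: the ImageMagick policy file may block PDF output. If so, edit the policy and change the rights of the PDF coder from none to read|write:

sudo vim /etc/ImageMagick-6/policy.xml

Before:

<policy domain="coder" rights="none" pattern="PDF" />

After:

<policy domain="coder" rights="read|write" pattern="PDF" />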
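The same change can be made without opening an editor. A minimal sketch, assuming the policy line looks exactly like the one shown above; the function name allow_pdf is made up, and it is meant to be tried on a copy of the file first.

```shell
#!/bin/sh
# allow_pdf FILE: on the line for pattern="PDF" only, switch
# rights="none" to rights="read|write". Other policy lines are untouched.
allow_pdf() {
  tmp=$(mktemp)
  sed '/pattern="PDF"/s/rights="none"/rights="read|write"/' "$1" > "$tmp" &&
    cat "$tmp" > "$1"   # overwrite in place, keeping the file's permissions
  rm -f "$tmp"
}

# Usage (commented out; the real file needs root to be replaced):
#   cp /etc/ImageMagick-6/policy.xml policy.xml.new
#   allow_pdf policy.xml.new
#   sudo cp policy.xml.new /etc/ImageMagick-6/policy.xml
```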
 
 
Bonus
[1] El Pueblo newspaper, PDF URL: https://www.diarioelpueblo.com.pe/wp-content/uploads/2021/07/01-07-2021.pdf
 
