Diffstat (limited to 'python/python3-pdfminer.six')
-rw-r--r-- | python/python3-pdfminer.six/README | 36
-rw-r--r-- | python/python3-pdfminer.six/python3-pdfminer.six.SlackBuild | 81
-rw-r--r-- | python/python3-pdfminer.six/python3-pdfminer.six.info | 10
-rw-r--r-- | python/python3-pdfminer.six/slack-desc | 19
4 files changed, 146 insertions, 0 deletions
diff --git a/python/python3-pdfminer.six/README b/python/python3-pdfminer.six/README
new file mode 100644
index 0000000000..0f9bb3a96d
--- /dev/null
+++ b/python/python3-pdfminer.six/README
@@ -0,0 +1,36 @@
+Pdfminer.six is a tool for extracting information from PDF documents. It
+focuses on getting and analyzing text data. Pdfminer.six extracts the
+text from a page directly from the sourcecode of the PDF. It can also be
+used to get the exact location, font or color of the text.
+
+It is built in a modular way such that each component of pdfminer.six
+can be replaced easily. You can implement your own interpreter or
+rendering device that uses the power of pdfminer.six for other purposes
+than text analysis.
+
+Features:
+
+* Written entirely in Python.
+* Parse, analyze, and convert PDF documents.
+* Extract content as text, images, html or hOCR.
+* PDF-1.7 specification support. (well, almost).
+* CJK languages and vertical writing scripts support.
+* Various font types (Type1, TrueType, Type3, and CID) support.
+* Support for extracting images (JPG, JBIG2, Bitmaps).
+* Support for various compressions (ASCIIHexDecode, ASCII85Decode,
+  LZWDecode, FlateDecode, RunLengthDecode, CCITTFaxDecode)
+* Support for RC4 and AES encryption.
+* Support for AcroForm interactive form extraction.
+* Table of contents extraction.
+* Tagged contents extraction.
+* Automatic layout analysis.
+
+Pdfminer.six comes with two handy tools: pdf2txt.py and dumppdf.py.
+
+The pdf2txt.py tool extracts all the text from a PDF. It uses layout
+analysis with sensible defaults to order and group the text in a
+sensible way.
+
+The dumppdf.py tool can be used to extract the internal structure from a
+PDF. This tool is primarily for debugging purposes, but that can be
+useful to anybody working with PDF’s.
diff --git a/python/python3-pdfminer.six/python3-pdfminer.six.SlackBuild b/python/python3-pdfminer.six/python3-pdfminer.six.SlackBuild
new file mode 100644
index 0000000000..27148d7723
--- /dev/null
+++ b/python/python3-pdfminer.six/python3-pdfminer.six.SlackBuild
@@ -0,0 +1,81 @@
+#!/bin/bash
+
+# Slackware build script for python3-pdfminer.six
+
+# Copyright 2023-2024, Alexander Verbovetsky, Moscow, Russia
+# Copyright 2015-2016 Brenton Earl <brent@exitstatusone.com>
+# All rights reserved.
+#
+# Redistribution and use of this script, with or without modification, is
+# permitted provided that the following conditions are met:
+#
+# 1. Redistributions of this script must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#
+# THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED
+# WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
+# EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+cd $(dirname $0) ; CWD=$(pwd)
+
+PRGNAM=python3-pdfminer.six
+VERSION=${VERSION:-20240706}
+BUILD=${BUILD:-1}
+TAG=${TAG:-_SBo}
+PKGTYPE=${PKGTYPE:-tgz}
+
+if [ -z "$ARCH" ]; then
+  case "$( uname -m )" in
+    i?86) ARCH=i586 ;;
+    arm*) ARCH=arm ;;
+       *) ARCH=$( uname -m ) ;;
+  esac
+fi
+
+if [ ! -z "${PRINT_PACKAGE_NAME}" ]; then
+  echo "$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.$PKGTYPE"
+  exit 0
+fi
+
+TMP=${TMP:-/tmp/SBo}
+PKG=$TMP/package-$PRGNAM
+OUTPUT=${OUTPUT:-/tmp}
+
+set -e
+
+rm -rf $PKG
+mkdir -p $TMP $PKG $OUTPUT
+cd $TMP
+rm -rf ${PRGNAM:8}-$VERSION
+tar xvf $CWD/${PRGNAM:8}-$VERSION.tar.gz
+cd ${PRGNAM:8}-$VERSION
+chown -R root:root .
+find -L . \
+ \( -perm 777 -o -perm 775 -o -perm 750 -o -perm 711 -o -perm 555 \
+  -o -perm 511 \) -exec chmod 755 {} \; -o \
+ \( -perm 666 -o -perm 664 -o -perm 640 -o -perm 600 -o -perm 444 \
+  -o -perm 440 -o -perm 400 \) -exec chmod 644 {} \;
+
+sed -i "s/__VERSION__/$VERSION/" pdfminer/__init__.py
+
+python3 setup.py install --root=$PKG
+
+find $PKG -print0 | xargs -0 file | grep -e "executable" -e "shared object" | grep ELF \
+  | cut -f 1 -d : | xargs strip --strip-unneeded 2> /dev/null || true
+
+mkdir -p $PKG/usr/doc/$PRGNAM-$VERSION
+cp -a *.md samples $PKG/usr/doc/$PRGNAM-$VERSION
+cat $CWD/$PRGNAM.SlackBuild > $PKG/usr/doc/$PRGNAM-$VERSION/$PRGNAM.SlackBuild
+
+mkdir -p $PKG/install
+cat $CWD/slack-desc > $PKG/install/slack-desc
+
+cd $PKG
+/sbin/makepkg -l y -c n $OUTPUT/$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.$PKGTYPE
diff --git a/python/python3-pdfminer.six/python3-pdfminer.six.info b/python/python3-pdfminer.six/python3-pdfminer.six.info
new file mode 100644
index 0000000000..10515977bc
--- /dev/null
+++ b/python/python3-pdfminer.six/python3-pdfminer.six.info
@@ -0,0 +1,10 @@
+PRGNAM="python3-pdfminer.six"
+VERSION="20240706"
+HOMEPAGE="https://github.com/pdfminer/pdfminer.six"
+DOWNLOAD="https://github.com/pdfminer/pdfminer.six/archive/20240706/pdfminer.six-20240706.tar.gz"
+MD5SUM="7b6e98471239dde4bbdfb910b13ffa05"
+DOWNLOAD_x86_64=""
+MD5SUM_x86_64=""
+REQUIRES="cryptography python3-setuptools-git-versioning"
+MAINTAINER="Alexander Verbovetsky"
+EMAIL="alik@ejik.org"
diff --git a/python/python3-pdfminer.six/slack-desc b/python/python3-pdfminer.six/slack-desc
new file mode 100644
index 0000000000..b996061944
--- /dev/null
+++ b/python/python3-pdfminer.six/slack-desc
@@ -0,0 +1,19 @@
+# HOW TO EDIT THIS FILE:
+# The "handy ruler" below makes it easier to edit a package description.
+# Line up the first '|' above the ':' following the base package name, and
+# the '|' on the right side marks the last column you can put a character in.
+# You must make exactly 11 lines for the formatting to be correct. It's also
+# customary to leave one space after the ':' except on otherwise blank lines.
+
+                    |-----handy-ruler------------------------------------------------------|
+python3-pdfminer.six: python3-pdfminer.six (PDF parser and analyzer)
+python3-pdfminer.six:
+python3-pdfminer.six:
+python3-pdfminer.six: Pdfminer.six is a tool for extracting information from PDF documents.
+python3-pdfminer.six: It focuses on getting and analyzing text data. Pdfminer.six extracts
+python3-pdfminer.six: the text from a page directly from the sourcecode of the PDF. It can
+python3-pdfminer.six: also be used to get the exact location, font or color of the text.
+python3-pdfminer.six:
+python3-pdfminer.six: Homepage: https://github.com/pdfminer/pdfminer.six
+python3-pdfminer.six:
+python3-pdfminer.six:
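
Note on the names used in the SlackBuild: the script unpacks ${PRGNAM:8}-$VERSION.tar.gz, which relies on bash substring expansion to drop the "python3-" prefix from PRGNAM so that the result matches the upstream tarball named in the .info file. The following is only a sketch of that expansion and of the package file name printed under PRINT_PACKAGE_NAME. The variables SRCNAM, TARBALL and PKGFILE are hypothetical helpers that do not appear in the script, and ARCH is fixed to x86_64 here as an assumption, whereas the script derives it from `uname -m`.

```shell
#!/bin/bash
# Values copied from the SlackBuild defaults in the diff.
PRGNAM=python3-pdfminer.six
VERSION=20240706
BUILD=1
TAG=_SBo
PKGTYPE=tgz

# Assumption for illustration: the real script sets ARCH from `uname -m`.
ARCH=x86_64

# ${PRGNAM:8} skips the first 8 characters ("python3-") of PRGNAM.
# SRCNAM, TARBALL and PKGFILE are hypothetical names used only in this sketch.
SRCNAM=${PRGNAM:8}
TARBALL=$SRCNAM-$VERSION.tar.gz
PKGFILE=$PRGNAM-$VERSION-$ARCH-$BUILD$TAG.$PKGTYPE

echo "upstream name: $SRCNAM"
echo "source tarball: $TARBALL"
echo "package file:   $PKGFILE"
```

Substring expansion of this form is a bash feature, which is consistent with the #!/bin/bash line at the top of the script; a strictly POSIX shell would likely reject it.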