PDFtoTEXT
[[DTP関連]]
*Extracting text from a PDF with embedded text [#u3996c31]
This is done in the terminal (macOS Sierra).
--------------------------------------------------------...
PDFtoTEXT
Usage: pdftotext [options] <PDF-file> [<text-file>]
-f <int> : first page to convert
-l <int> : last page to convert
-r <fp> : resolution, in DPI (default is 72)
-x <int> : x-coordinate of the crop area to...
-y <int> : y-coordinate of the crop area to...
-W <int> : width of crop area in pixels (de...
-H <int> : height of crop area in pixels (d...
-layout : maintain original physical layout
-fixed <fp> : assume fixed-pitch (or tabular) ...
-raw : keep strings in content stream o...
-htmlmeta : generate a simple HTML file, inc...
-enc <string> : output text encoding name
-listenc : list available encodings
-eol <string> : output end-of-line convention (u...
-nopgbrk : don't insert page breaks between...
-bbox : output bounding box for each wor...
-bbox-layout : like -bbox but with extra layout...
-opw <string> : owner password (for encrypted fi...
-upw <string> : user password (for encrypted fil...
-q : don't print any messages or errors
-v : print copyright and version info
-h : print usage information
-help : print usage information
--help : print usage information
-? : print usage information
Convert only the first page to text:
pdftotext -raw -f 1 -l 1 inf.pdf out
inf.pdf: the source PDF
out: the output file; if omitted, the text is written to inf.txt
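The single-page command above generalizes to any page range. The following is a minimal sketch, not part of pdftotext itself: the function name `extract_pages` and the `DRY_RUN` variable are hypothetical helpers introduced here for illustration. It uses only the `-raw`, `-f`, and `-l` options listed in the usage text.

```shell
#!/bin/sh
# extract_pages FIRST LAST INPUT.pdf [OUTPUT]
# Hypothetical wrapper around pdftotext: converts pages FIRST..LAST of
# INPUT.pdf to text. If OUTPUT is omitted, pdftotext picks the default
# output name itself (inf.pdf becomes inf.txt).
# Set DRY_RUN to any non-empty value to print the command instead of
# running it (an assumption of this sketch, not a pdftotext feature).
extract_pages() {
  first=$1
  last=$2
  input=$3
  output=${4:-}

  # Build the command line as positional parameters so that file names
  # containing spaces are passed through intact.
  if [ -n "$output" ]; then
    set -- pdftotext -raw -f "$first" -l "$last" "$input" "$output"
  else
    set -- pdftotext -raw -f "$first" -l "$last" "$input"
  fi

  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

For example, `extract_pages 1 1 inf.pdf out` runs the same command as shown above, and `extract_pages 2 5 inf.pdf` would write pages 2 through 5 to inf.txt.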