ooo.py @ 491

Revision 491, 2.3 kB (checked in by jerome, 16 years ago)
Major code cleanup. The code is now clearer, although probably a bit slower since a file can be opened several times. Universal newline mode is now only used when needed (PS, PDF and plain text), and binary mode is used for the other formats. This means we will finally be able to remove mmap calls wherever possible.
Property svn:eol-style set to `native`
Property svn:keywords set to `Author Date Id Revision`

#! /usr/bin/env python
# -*- coding: ISO-8859-15 -*-
#
# pkpgcounter : a generic Page Description Language parser
#
# (c) 2003, 2004, 2005, 2006, 2007 Jerome Alet <alet@librelogiciel.com>
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# $Id$
#

"""This module implements a page counter for OpenDocument documents."""

import sys
import zipfile

import pdlparser

class Parser(pdlparser.PDLParser) :
    """A parser for OpenOffice.org documents."""
    def isValid(self) :
        """Returns True if data is OpenDocument, else False."""
        # OpenDocument files are ZIP archives, which start with "PK".
        if self.firstblock[:2] == "PK" :
            try :
                self.archive = zipfile.ZipFile(self.infile)
                self.contentxml = self.archive.read("content.xml")
                self.metaxml = self.archive.read("meta.xml")
            except :
                # Not a ZIP archive, or one of the two members is missing.
                return False
            else :
                self.logdebug("DEBUG: Input file is in the OpenDocument (ISO/IEC DIS 26300) format.")
                return True
        else :
            return False

    def getJobSize(self) :
        """Counts pages in an OpenOffice.org document.

           Algorithm by Jerome Alet.
        """
        pagecount = 0
        try :
            # First try with Text documents
            index = self.metaxml.index("meta:page-count=")
            pagecount = int(self.metaxml[index:].split('"')[1])
        except :
            # Now try with Impress documents
            pagecount = self.contentxml.count("<draw:page ")
            if not pagecount :
                # Probably a Spreadsheet document
                raise pdlparser.PDLParserError("OpenOffice.org's spreadsheet documents are not yet supported.")
        return pagecount
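
The counting strategy in `getJobSize` can also be tried outside of pkpgcounter. What follows is only a minimal standalone sketch in Python 3, not the module's actual code: the names `count_odf_pages` and `make_sample` are hypothetical, the `pdlparser` base class is left out, `ValueError` stands in for `PDLParserError`, and a regular expression replaces the original `index`/`split` lookup, so it only accepts a double-quoted numeric attribute value.

```python
import io
import re
import zipfile


def count_odf_pages(data):
    """Return the page count of an OpenDocument file passed as bytes.

    Raises ValueError when the data is not a readable OpenDocument
    archive or when no page count can be found (e.g. spreadsheets).
    """
    # OpenDocument files are ZIP archives, which start with b"PK".
    if data[:2] != b"PK":
        raise ValueError("not a ZIP archive")
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            content_xml = archive.read("content.xml")
            meta_xml = archive.read("meta.xml")
    except (zipfile.BadZipFile, KeyError) as exc:
        # Corrupt archive, or one of the two expected members is missing.
        raise ValueError("not an OpenDocument archive") from exc

    # Text documents: meta.xml carries a meta:page-count attribute.
    match = re.search(rb'meta:page-count="(\d+)"', meta_xml)
    if match:
        return int(match.group(1))

    # Presentations: count the <draw:page ...> elements in content.xml.
    pages = content_xml.count(b"<draw:page ")
    if not pages:
        raise ValueError("no page count found (probably a spreadsheet)")
    return pages


def make_sample(meta, content):
    """Build a tiny in-memory archive for demonstration (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("meta.xml", meta)
        archive.writestr("content.xml", content)
    return buffer.getvalue()


if __name__ == "__main__":
    text_doc = make_sample('<meta:document-statistic meta:page-count="3"/>',
                           "<office:text/>")
    slides = make_sample("<office:meta/>",
                         '<draw:page draw:name="a"/><draw:page draw:name="b"/>')
    print(count_odf_pages(text_doc))  # 3
    print(count_odf_pages(slides))    # 2
```

The sample archives are deliberately minimal and are not valid OpenDocument files. They only contain the fragments that the two lookups search for.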

root / pkpgcounter / trunk / pkpgpdls / ooo.py @ 491
