Python XML strings

Published: 2022-12-27 05:44:51

A. Using Python to search for a given string across multiple compressed archives that contain many XML files

ziprar.py

#!/usr/bin/env python3
__author__ = 'williezh'

import os
import sys
import time
import shutil
import zipfile
from zipfile import ZIP_DEFLATED


# Zip file helper class
class ZFile(object):
    def __init__(self, fname, mode='r', basedir=''):
        self.fname = fname
        self.mode = mode
        if self.mode in ('w', 'a'):
            self.zfile = zipfile.ZipFile(fname, mode, compression=ZIP_DEFLATED)
        else:
            self.zfile = zipfile.ZipFile(fname, self.mode)
        self.basedir = basedir
        if not self.basedir:
            self.basedir = os.path.dirname(fname)

    def addfile(self, path, arcname=None):
        path = path.replace('//', '/')
        if not arcname:
            if path.startswith(self.basedir):
                arcname = path[len(self.basedir):]
            else:
                # An empty arcname would be stored as '.', so let zipfile use the path
                arcname = None
        self.zfile.write(path, arcname)

    def addfiles(self, paths):
        for path in paths:
            if isinstance(path, tuple):
                self.addfile(*path)
            else:
                self.addfile(path)

    def close(self):
        self.zfile.close()

    def extract_to(self, path):
        for p in self.zfile.namelist():
            self.extract(p, path)

    def extract(self, fname, path):
        # Names ending in '/' are directory entries; only files are written
        if not fname.endswith('/'):
            fn = os.path.join(path, fname)
            ds = os.path.dirname(fn)
            if not os.path.exists(ds):
                os.makedirs(ds)
            with open(fn, 'wb') as f:
                f.write(self.zfile.read(fname))


# Create a zip file
def createZip(zfile, files):
    z = ZFile(zfile, 'w')
    z.addfiles(files)
    z.close()


# Extract a zip file into the given folder
def extractZip(zfile, path):
    z = ZFile(zfile)
    z.extract_to(path)
    z.close()


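# Extract a rar archive into the given folder
def extractRar(zfile, path):
    rar_command1 = "WinRAR.exe x -ibck %s %s" % (zfile, path)
    # Path kept as given in the source; adjust it to the local WinRAR install
    rar_command2 = r'"C:WinRAR.exe" x -ibck %s %s' % (zfile, path)
    # os.system() signals failure through its return value, not an exception,
    # so fall back to the second command when the first returns non-zero
    res = os.system(rar_command1)
    if res == 0:
        print("Path OK.")
        return
    res = os.system(rar_command2)
    if res == 0:
        print("Success to unrar the file {}.".format(path))
    else:
        print('Error: cannot unrar the file {}'.format(path))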


# Extract several archives into one temporary folder
def extract_files(file_list):
    newdir = str(int(time.time()))
    for fn in file_list:
        subdir = os.path.join(newdir, fn)
        if not os.path.exists(subdir):
            os.makedirs(subdir)
        if fn.endswith('.zip'):
            extractZip(fn, subdir)
        elif fn.endswith('.rar'):
            extractRar(fn, subdir)
    return newdir


# Look for certain files in a folder and return those whose content
# contains every string in findstr_list
def findstr_at(basedir, file_list, findstr_list):
    files = []
    for r, ds, fs in os.walk(basedir):
        for fn in fs:
            if fn in file_list:
                with open(os.path.join(r, fn)) as f:
                    s = f.read()
                if all(i in s for i in findstr_list):
                    files.append(os.path.join(r, fn))
    return files


if __name__ == '__main__':
    files = [i for i in sys.argv[1:] if not i.startswith('-')]
    unzipfiles = [i for i in files if i.endswith('.zip') or i.endswith('.rar')]
    xmlfiles = [i for i in files if i.endswith('.xml')]
    save_unzipdir = '-s' in sys.argv
    findstr = [i.split('=')[-1] for i in sys.argv if i.startswith('--find=')]
    findstring = ','.join(['`{}`'.format(i) for i in findstr])
    newdir = extract_files(unzipfiles)
    result = findstr_at(newdir, xmlfiles, findstr)
    if not result:
        msg = 'None of the file(s) contain the given string {}.'
        print(msg.format(findstring))
    else:
        msg = '{} file(s) contain the given string {}:'
        print(msg.format(len(result), findstring))
        print('\n'.join([i.replace(newdir + os.sep, '') for i in sorted(result)]))

    # Keep the extracted folder only when -s was passed
    if not save_unzipdir:
        shutil.rmtree(newdir)

$ python3 ziprar.py aaa.zip aaa2.zip aaa3.zip aaa.xml aaa1.xml aaa2.xml --find="It was" --find="when"
None of the file(s) contain the given string `It was`,`when`.
$ python3 ziprar.py aaa.zip aaa2.zip aaa3.zip aaa.xml aaa1.xml aaa2.xml --find="It was" --find="I"
2 file(s) contain the given string `It was`,`I`:
aaa.zip/aaa2.xml
aaa2.zip/aaa2.xml
$ python3 ziprar.py aaa.zip aaa2.zip aaa3.zip aaa.xml aaa1.xml aaa2.xml --find="It was"
2 file(s) contain the given string `It was`:
aaa.zip/aaa2.xml
aaa2.zip/aaa2.xml
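
For zip archives, the same search can probably be done without extracting to disk, since zipfile can read a member straight into memory. The following is only a sketch of that alternative, not part of the original script: the function name find_in_zip and the assumption that the XML members are UTF-8 text are mine.

```python
import zipfile


def find_in_zip(zip_path, names, needles, encoding="utf-8"):
    """Return the members of zip_path whose base name is in `names`
    and whose text contains every string in `needles`."""
    hits = []
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            if member.endswith("/"):
                continue  # directory entry, nothing to read
            if member.rsplit("/", 1)[-1] not in names:
                continue  # not one of the files we were asked about
            # Assumption: members are text in `encoding`; bad bytes are replaced
            text = zf.read(member).decode(encoding, errors="replace")
            if all(needle in text for needle in needles):
                hits.append(member)
    return hits
```

This avoids the temporary folder and the cleanup step, at the cost of holding each member in memory while it is checked. It does not cover .rar files, which the standard library cannot read.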

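B. Checking in Python whether a character is a legal XML character

# Suppose the characters you allow are "s" and "a"
some_letter = ["s", "a"]
ss = "sadsahchcdsc"
other_letters = []
for s in ss:
    if not some_letter.count(s):
        other_letters.append(s)
flag = True
if other_letters:
    print("The string contains other characters:", other_letters)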
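
The answer above checks against a hand-picked whitelist. If the goal is the XML notion of a legal character, the same idea can be applied to the Char production of the XML 1.0 specification (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). A minimal sketch, with the helper names is_xml_char and illegal_xml_chars made up for illustration:

```python
def is_xml_char(ch):
    """True if the single character `ch` is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def illegal_xml_chars(text):
    """Return the characters of `text` that XML 1.0 does not allow, in order."""
    return [ch for ch in text if not is_xml_char(ch)]


if __name__ == "__main__":
    print(illegal_xml_chars("ab\x00c\x0b"))
```

Note that this only says whether a character may appear in an XML document at all; characters such as < and & are legal but still need escaping inside text.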

C. How to convert an XML file to a string in Python

You probably don't mean an XML file; you mean converting an XML object to a string.

You can use the toxml() method.

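Node.toxml([encoding])
"""
Return the XML that the DOM represents as a string.
With no argument, the XML header does not specify an encoding, and the result is
Unicode string if the default encoding cannot represent all characters in the
document. Encoding this string in an encoding other than UTF-8 is likely
incorrect, since UTF-8 is the default encoding of XML.
With an explicit encoding [1] argument, the result is a byte string in the
specified encoding. It is recommended that this argument is always specified. To
avoid UnicodeError exceptions in case of unrepresentable text data, the
encoding argument should be specified as "utf-8".
Changed in version 2.3: the encoding argument was introduced; see writexml().
"""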
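
As an illustration of toxml(), here is a small sketch using xml.dom.minidom on Python 3, where the call without an argument returns str and the call with an encoding returns bytes. The sample document is made up.

```python
from xml.dom import minidom

# Hypothetical sample document
doc = minidom.parseString("<root><item>text</item></root>")

as_text = doc.toxml()                    # str, declaration without an encoding
as_bytes = doc.toxml(encoding="utf-8")   # bytes, declaration names utf-8
item_xml = doc.documentElement.firstChild.toxml()  # any node can be serialized

print(as_text)
print(as_bytes)
print(item_xml)
```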


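D. How to parse an XML string in Python

1. I didn't look closely at the XML above at first: every closing tag contains an escape character (a backslash). That is why the XML parser kept raising errors; the input was not well-formed XML. My fix was a regex substitution with re.sub from Python's re module: re.sub(r'(<)\\(/.+?>)',r'\g<1>\g<2>',f_xml)
2. After turning it into well-formed XML, I still parsed it with ElementTree, but loading raised another error:
cElementTree.ParseError: XML or text declaration not at start of entity: line 2, column 0
I couldn't find a suitable answer online, but taking the message literally, something is wrong at the start of the document. I tried removing the XML declaration and, to my surprise, the error went away. I don't quite understand why the declaration can't be there. My fix: f_xml=test_xml.replace('<?xml version="1.0" encoding="gbk"?>','')
3. After that, loading works and the nodes I need can be retrieved.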
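
On the question in step 2: the "line 2, column 0" in the message suggests the declaration sat on the second line, that is, something (probably a blank line) came before it, and XML only allows the declaration as the very first thing in the document. Under that assumption, stripping leading whitespace should work as well as deleting the declaration. A sketch with a made-up sample string and a hypothetical helper name parse_escaped_xml:

```python
import re
import xml.etree.ElementTree as ET


def parse_escaped_xml(text):
    """Undo the backslash in closing tags (<\\/tag> -> </tag>), drop leading
    whitespace so the declaration is at the very start, then parse."""
    cleaned = re.sub(r'(<)\\(/.+?>)', r'\g<1>\g<2>', text)
    return ET.fromstring(cleaned.lstrip())


# Hypothetical input: a leading newline plus escaped closing tags
raw = '\n<?xml version="1.0" encoding="utf-8"?>\n<root><item>1<\\/item><\\/root>'
root = parse_escaped_xml(raw)
print(root.tag, root.find("item").text)
```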

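E. Parsing XML returned by a GET request in Python

You didn't even fix the formatting of the code you pasted; it is painful to read.

Also, here is some code I wrote early on that parses XML into a dict. It supports Chinese.

Adjust the details yourself.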

import xml.etree.ElementTree as ET
import os
'''
Convert an XML file in the specified directory into a dict
strXmlFileName: name of the XML file
strElementPath: XML node path
dictSubElement: dict used to return the result
eg. my_dict = xml2dict('xxx.xml', 'node', my_dict)
'''
def xml2dict(strXmlFileName, strElementPath, dictSubElement):
    elementList = []
    dictSubElement.clear()
    # Join with a separator; plain concatenation would glue the names together
    strXmlPath = os.path.join(os.getcwd(), strXmlFileName)
    try:
        eTree = ET.parse(strXmlPath)
    except Exception as errorinfo:
        print("xml2dict: ET.parse(%s) generate exception, errorinfo: %s" % (strXmlPath, errorinfo))
        raise

    try:
        elementList = eTree.findall(strElementPath)
    except Exception as errorinfo:
        print("xml2dict: eTree.findall(%s) generate exception, errorinfo: %s" % (strElementPath, errorinfo))
        raise

    pathList = []
    for element in elementList:
        # Iterate the element directly; getchildren() was removed in Python 3.9
        for subelement in element:
            # print("tag: %s, text: %s" % (subelement.tag, subelement.text))
            if subelement.text is not None:
                if subelement.tag in pathList:
                    dictSubElement[subelement.tag] = os.getcwd() + subelement.text
                else:
                    dictSubElement[subelement.tag] = subelement.text
            else:
                dictSubElement[subelement.tag] = ""  # store None as an empty string
    # Return the dict so the usage shown in the docstring works
    return dictSubElement
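
When the XML arrives as the body of a GET response rather than as a file, the same conversion can be done on the string with ET.fromstring. This is only a sketch: the function name xml_text_to_dict and the sample response body are invented for illustration, and the HTTP request itself is left out.

```python
import xml.etree.ElementTree as ET


def xml_text_to_dict(xml_text, element_path):
    """Collect tag -> text for the children of every node matching element_path."""
    root = ET.fromstring(xml_text)
    result = {}
    for element in root.findall(element_path):
        for child in element:
            # Empty elements have text None; store an empty string instead
            result[child.tag] = child.text if child.text is not None else ""
    return result


# Hypothetical response body, already decoded to str
response_body = "<response><node><name>张三</name><city>北京</city><note/></node></response>"
info = xml_text_to_dict(response_body, "node")
print(info)
```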

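F. Reading an XML file in Python fails with ValueError: multi-byte encodings are not supported

Problem: when reading an XML file with Python, the error ValueError: multi-byte encodings are not supported is raised.

The XML is encoded in gb2312.

Many posts say that changing the XML's encoding to utf-8 makes it run normally.

But there is a problem here: both the file's actual encoding and the encoding field in its declaration are gb2312. If only the encoding field is changed, any later use of that file will parse gb2312 bytes as utf-8, with unpredictable consequences.
The second problem is that changing a single XML file is fine, but with hundreds or thousands of files it becomes impractical.
Solution: use the parseString function
Python offers two ways to feed XML to the parser: as a file or as a string. We can read the XML file into memory and close the file, use replace to change gb2312 to utf-8 in the XML string, and then parse it with parseString. This avoids the error.

Caveats: if the file is too large there may not be enough memory, so this approach suits small XML files. Be sure to close files that are no longer in use so they don't hold file descriptors.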
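
The steps above can be sketched roughly as follows on Python 3. The function name parse_gb2312_xml is made up, and the sketch assumes the file really is gb2312-encoded: the text is decoded on read and re-encoded as UTF-8, so the rewritten declaration matches the bytes handed to the parser, and the file on disk is never modified.

```python
import re
from xml.dom import minidom


def parse_gb2312_xml(path):
    """Parse a gb2312-encoded XML file without touching the file on disk."""
    # The with-block closes the file as soon as the text is in memory
    with open(path, encoding="gb2312") as f:
        text = f.read()
    # Rewrite only the encoding named in the XML declaration
    text = re.sub(
        r'(<\?xml[^>]*encoding=["\'])gb2312(["\'])',
        r'\g<1>utf-8\g<2>',
        text,
        count=1,
        flags=re.IGNORECASE,
    )
    # Hand the parser UTF-8 bytes that agree with the new declaration
    return minidom.parseString(text.encode("utf-8"))
```

For a directory with many files, the function can simply be called in a loop, which addresses the "hundreds or thousands of files" concern without editing any of them.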

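G. How to extract comments from XML with Python

from xml.etree import ElementTree
str_ = ''  # the XML string read from the file
xml_obj = ElementTree.fromstring(str_)

Then work on xml_obj, which is itself an XML node.
xml_obj.getchildren() returns the list of the root node's children (removed in Python 3.9; use list(xml_obj) there)
xml_obj.findall(node_name) finds all direct children of xml_obj named node_name
xml_obj.tag is the node's tag
xml_obj.text is the node's text; in this example it yields the text K.
xml_obj.tail is the text that follows the node; in this example, to get "Channel Regulator KCR1 Suppresses Heart Rhythm by Modulating the Pacemaker Current I" you find the node tagged sup and read its tail.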
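
The text/tail behaviour described above, plus the comment extraction the question asks about, can be sketched like this. The sample XML is invented to resemble the example, and keeping comments relies on TreeBuilder(insert_comments=True), which needs Python 3.8 or later; by default ElementTree drops comments while parsing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the example in the answer
sample = (
    "<article><!-- title node -->"
    "<title>K<sup>+</sup> Channel Regulator</title>"
    "</article>"
)

# Ask the tree builder to keep comments (Python 3.8+)
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring(sample, parser=parser)

title = root.find("title")
sup = title.find("sup")
print(title.text)  # text before the first child of <title>
print(sup.text)    # text inside <sup>
print(sup.tail)    # text after </sup>, up to the next tag

# Comment nodes are elements whose tag is the ET.Comment factory
comments = [el.text for el in root.iter() if el.tag is ET.Comment]
print(comments)
```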

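H. Parsing an XML-formatted string with Python

For data like yours, parsing it as XML is not as simple as using a regex:
r'(?<=\<Result\>)(.+?)(?=\</Result\>)'

Using XML is more cumbersome:
from xml.dom import minidom
dom1 = minidom.parseString(xml)
result = dom1.getElementsByTagName("Result")
result = result[0].childNodes[0].nodeValue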
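
Both approaches side by side, as a runnable sketch. The sample string is an assumption, since the asker's data is not shown; only the Result tag name comes from the answer.

```python
import re
from xml.dom import minidom

# Hypothetical input standing in for the asker's data
xml = "<Response><Result>OK</Result></Response>"

# Regex route: lookbehind/lookahead keep the tags out of the match
match = re.search(r'(?<=\<Result\>)(.+?)(?=\</Result\>)', xml)
regex_result = match.group(1) if match else None

# DOM route: first <Result> element, first child node (the text node)
dom1 = minidom.parseString(xml)
nodes = dom1.getElementsByTagName("Result")
dom_result = nodes[0].childNodes[0].nodeValue if nodes and nodes[0].childNodes else None

print(regex_result, dom_result)
```

The regex is shorter but only sees the literal tag text, so it misses elements with attributes or content spanning lines; the DOM route handles those at the cost of a full parse.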

I. Does parsing XML in Python impose any requirement on the length of the XML string?

No. The only requirement is on the format of the XML.

