Strip spaces/tabs/newlines - python

Question 1

I am trying to remove all spaces, tabs, and newlines from a string in python 2.7 on Linux.

I wrote this, which I expected to do the job:

myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = myString.strip(' \n\t')
print myString

Output:

I want to Remove all white   spaces, new lines 
 and tabs

It seems like a simple thing to do, yet I am missing something here. Should I be importing something?

Question 2

Use str.split([sep[, maxsplit]]) with no sep or with sep=None:

From the docs:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

Demo:

>>> myString.split()
['I', 'want', 'to', 'Remove', 'all', 'white', 'spaces,', 'new', 'lines', 'and', 'tabs']

Use str.join on the returned list to get this output:

>>> ' '.join(myString.split())
'I want to Remove all white spaces, new lines and tabs'
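
As a rough sketch, the split/join idea can be wrapped in a small helper. The name normalize_whitespace is hypothetical (it is not part of the standard library), and the snippet is written to run unchanged on Python 2.7 and Python 3. It also contrasts split() with split(' '), which treats only single spaces as separators and therefore leaves tabs and newlines in the result.

```python
def normalize_whitespace(text):
    """Collapse every run of whitespace into one space and trim both ends."""
    # split() with no arguments splits on any run of whitespace and drops
    # the empty strings that leading/trailing whitespace would produce.
    return ' '.join(text.split())


my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# Runs of whitespace become a single space.
collapsed = normalize_whitespace(my_string)

# Joining with an empty string removes the whitespace entirely.
removed = ''.join(my_string.split())

# split(' ') uses the literal-separator algorithm instead: only single
# spaces are separators, so the tab survives as an item of its own.
literal_split = "a \t b".split(' ')
```

Joining with '' instead of ' ' is the variant that matches the question as asked, since it removes the whitespace rather than collapsing it.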

Question 3

If you want to remove multiple whitespace items and replace them with single spaces, the easiest way is a regexp like this:

>>> import re
>>> myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
>>> re.sub(r'\s+', ' ', myString)
'I want to Remove all white spaces, new lines and tabs '

You can then remove the trailing space with .strip() if you want to.
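
Putting the two steps together, a minimal sketch might look like this. The name WHITESPACE_RUN is just an illustrative choice, and precompiling the pattern is optional; it only helps when the same pattern is applied many times.

```python
import re

# \s matches spaces, tabs, newlines and other whitespace characters;
# the + makes each run of them a single match.
WHITESPACE_RUN = re.compile(r'\s+')

my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# Replace each run with one space, then trim what is left at the ends.
cleaned = WHITESPACE_RUN.sub(' ', my_string).strip()
```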

Question 4

Use the re library

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = re.sub(r"[\n\t\s]*", "", myString)
print myString

Output:

IwanttoRemoveallwhitespaces,newlinesandtabs
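
For what it's worth, \s already matches \n and \t, so the character class can probably be shortened. A small sketch comparing the two patterns on the same input (variable names are illustrative only):

```python
import re

my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# The pattern from the answer: an explicit class, zero or more times.
verbose = re.sub(r"[\n\t\s]*", "", my_string)

# \s on its own covers newlines and tabs; + avoids matching empty strings.
simpler = re.sub(r"\s+", "", my_string)

# Both patterns should strip exactly the same characters here.
same_result = (verbose == simpler)
```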

Question 5

import re

mystr = "I want to Remove all white \t spaces, new lines \n and tabs \t"
print re.sub(r"\W", "", mystr)

Output : IwanttoRemoveallwhitespacesnewlinesandtabs
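
Note that \W matches every non-word character, not just whitespace, which is why the comma is gone from the output above. A short sketch of the difference, assuming you may want to keep punctuation (variable names are illustrative):

```python
import re

my_str = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# \W removes anything that is not a letter, digit or underscore,
# so the comma disappears together with the whitespace.
no_non_word = re.sub(r"\W", "", my_str)

# \s removes whitespace only, so the comma is kept.
no_whitespace = re.sub(r"\s", "", my_str)

# Underscores count as word characters; hyphens and spaces do not.
mixed = re.sub(r"\W", "", "a_b-c d")
```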

Question 6

This will remove only tabs, newlines, and spaces, and nothing else.

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
output   = re.sub(r"[\n\t\s]*", "", myString)

Output:

IwanttoRemoveallwhitespaces,newlinesandtabs

Good day!

Question 7

The solutions above that suggest regex aren't ideal: this is such a small task, and regex carries more overhead than the simplicity of the task justifies.

Here's what I do:

myString = myString.replace(' ', '').replace('\t', '').replace('\n', '')

Or, if you have a bunch of things to remove, so that a one-line solution would get too long:

removal_list = [' ', '\t', '\n']
for s in removal_list:
  myString = myString.replace(s, '')
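
The loop can also be wrapped in a function so the list of characters is easy to change. This is only a sketch; remove_chars and its default tuple are hypothetical names chosen for illustration. Keep in mind that, unlike \s or split(), it removes only the characters you list, so something like '\r' stays unless you add it.

```python
def remove_chars(text, removal_list=(' ', '\t', '\n')):
    """Return text with every string in removal_list removed."""
    for s in removal_list:
        # str.replace returns a new string; the original is not modified.
        text = text.replace(s, '')
    return text


my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"
stripped = remove_chars(my_string)

# A carriage return is not in the default list, so it is left in place.
with_cr = remove_chars("a\r\n b")
```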

Question 8

Since nothing else here was more intricate, I wanted to share this because it helped me.

This is what I originally used:

import requests
import re

url = '/programming/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
print("{}".format(r.content))

Undesired result:

b'<!DOCTYPE html>\r\n\r\n\r\n    <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive">\r\n\r\n    <head>\r\n\r\n        <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>\r\n        <link

This is what I changed it to:

import requests
import re

url = '/programming/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
regex = r'\s+'
print("CNT: {}".format(re.sub(regex, " ", r.content.decode('utf-8'))))

Desired result:

<!DOCTYPE html> <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive"> <head> <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>

The exact regex that @MattH mentioned was what worked when I fit it into my code. Thanks!

Note: this is python3
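
For anyone who wants to try the decode-then-collapse step without making a network request, here is a minimal sketch. The bytes literal is a made-up stand-in for r.content, not real output from any page.

```python
import re

# Hypothetical stand-in for the bytes that requests returns in r.content.
raw = b'<!DOCTYPE html>\r\n\r\n\r\n    <html>\r\n\r\n    <head>\r\n'

# On Python 3, bytes must be decoded to str before a str pattern is applied.
text = raw.decode('utf-8')

# \s also matches \r, so Windows-style line endings collapse as well.
collapsed = re.sub(r'\s+', ' ', text).strip()
```

Decoding first matters on Python 3: passing the raw bytes to re.sub with a str pattern raises a TypeError.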