Python编程快速上手 — 让繁琐工作自动化

基础概念

导入模块

import string
import random,sys,os

from random import *

使用这种形式的 import 语句，调用 random 模块中的函数时不需要 random.前缀。但是，使用完整的名称会让代码更可读，所以最好是使用普通形式的 import 语句。

声明函数

不带参数

def Hello():
    print("----hello world.----")

Hello()

带有参数

输出中文需要在开头加入# -*- coding: UTF-8 -*- 或者 #coding=utf-8

def Hello(name):
    print("----hello world.----" + name)

Hello("Yunis")
Hello("三十一")

带有多个参数

def Hello(name,age):
    print( name + " is " + str(age) + " years old this year")

Hello("Yunis",18)
Hello("三十一",25)

关键字参数和 `print()`

  print('Hello')
  print('World')

输出为：

Hello 
World

这是因为 print()函数自动在传入的字符串末尾添加了换行符。可以通过 end 关键字参数，设置这个末尾。

print('Hello', end='')
print('World')

输出为：

HelloWorld

类似地，如果向 print()传入多个字符串值,该函数就会自动用一个空格分隔它们。

>>> print('cats', 'dogs', 'mice') 
>>> cats dogs mice

可以传入 sep 关键字参数，替换掉默认的分隔字符.

>>> print('cats', 'dogs', 'mice', sep=',')
>>> cats,dogs,mice

作用域

全局作用域中的代码不能使用任何局部变量;
但是，局部作用域可以访问全局变量;
一个函数的局部作用域中的代码，不能使用其他局部作用域中的变量。 ⁃ * 如果在不同的作用域中，你可以用相同的名字命名不同的变量。也就是说，可以有一个名为 spam 的局部变量，和一个名为 spam 的全局变量。

如果需要在一个函数内修改全局变量，就使用 global 语句。

def spam():
    global eggs
    eggs = 'spam'
      
eggs = 'global'
spam()
print(eggs)//spam

如果变量在全局作用域中使用(即在所有函数之外)，它就总是全局变量。
如果在一个函数中，有针对该变量的 global 语句，它就是全局变量。
否则，如果该变量用于函数中的赋值语句，它就是局部变量。
但是，如果该变量没有用在赋值语句中，它就是全局变量。

异常处理

错误可以由 try 和 except 语句来处理。那些可能出错的语句被放在 try 子句中。如果错误发生，程序执行就转到接下来的 except 子句开始处。

def spam(divideBy):
    try:
        return 42 / divideBy 
    except ZeroDivisionError:
        print('Error: Invalid argument.')
        
print(spam(2))
print(spam(12))
print(spam(1))

#Error: Invalid argument. 
#None
print(spam(0))

列表

下标取值

spam = ['cat', 'bat','dog']
print(spam[0])  #cat
print(spam[-1]) #dog

	cat	bat	dog
正向	0	1	2
负向	-3	-2	-1

切片获取子列表

spam = ['cat', 'bat','dog']

print(spam[:-1])  #['cat', 'bat']
print(spam[0:-1]) #['cat', 'bat']
print(spam[0:1])  #['cat']
print(spam[:1])   #['cat']
print(spam[0:2])  #['cat', 'bat']
print(spam[0:3])  #['cat', 'bat', 'dog']

列表链接、复制

+ 操作符可以连接两个列表，得到一个新列表。

spam = ['cat', 'bat','dog'] + [1,2,3]
print(spam)  #['cat', 'bat', 'dog', 1, 2, 3]

多重赋值技巧

spam = ['cat', 'bat','dog']
x,y,z = spam
print(x)  #cat
print(y)  #bat
print(z)  #dog

元组、字符串、列表转换

>>> tuple(['cat', 'dog', 5]) 
 ('cat', 'dog', 5)
>>> list(('cat', 'dog', 5)) 
 ['cat', 'dog', 5]
>>> list('hello')
['h', 'e', 'l', 'l', 'o']

自动化任务

正则表达式

Python 中所有正则表达式的函数都在 re 模块中。

? 匹配0次或者一次
* 匹配0次或者多次
+ 匹配1次或者多次
{n}匹配 n 次前面的分组。
{n,}匹配 n 次或更多前面的分组。
{,m}匹配零次到 m 次前面的分组。
{n,m}匹配至少 n 次、至多 m 次前面的分组。
{n,m}?或*?或+?对前面的分组进行非贪心匹配。
^spam 意味着字符串必须以 spam 开始。
spam$意味着字符串必须以 spam 结束。
.匹配所有字符，换行符除外。
\d、\w 和\s 分别匹配数字、单词和空格。
\D、\W 和\S 分别匹配出数字、单词和空格外的所有字符。
[abc]匹配方括号内的任意字符(诸如 a、b 或 c)。
[^abc]匹配不在方括号内的任意字符。
用 sub()方法替换字符串

>>> namesRegex = re.compile(r'Agent \w+')
>>> namesRegex.sub('CENSORED', 'Agent Alice gave the secret documents to Agent Bob.') 

'CENSORED gave the secret documents to CENSORED.'

有时候，你可能需要使用匹配的文本本身，作为替换的一部分。在 sub()的第一个参数中，可以输入\1、\2、\3……。表示“在替换中输入分组 1、2、3……的文本”。

>>> agentNamesRegex = re.compile(r'Agent (\w)\w*')
>>> agentNamesRegex.sub(r'\1****', 'Agent Alice told Agent Carol that Agent Eve knew Agent Bob was a double agent.')

A**** told C**** that E**** knew B**** was a double agent.'

读写文件

文件及文件路径

相对路径：相对于程序的当前工作目录。
绝对路径：总是从根文件开始。

os.makedirs() 创建文件夹

调用 os.path.abspath(path)将返回参数的绝对路径的字符串。这是将相对路径转换为绝对路径的简便方法。
调用 os.path.isabs(path)，如果参数是一个绝对路径，就返回 True，如果参数是一个相对路径，就返回 False。
调用 os.path.relpath(path, start)将返回从 start 路径到 path 的相对路径的字符串。如果没有提供 start，就使用当前工作目录作为开始路径。
调用 os.path.getsize(path)将返回 path 参数中文件的字节数。
调用 os.listdir(path)将返回文件名字符串的列表，包含 path 参数中的每个文件 (请注意，这个函数在 os 模块中，而不是 os.path)。
检查路径有效性
- 如果 path 参数所指的文件或文件夹存在，调用 os.path.exists(path)将返回 True，否则返回 False
- 如果 path 参数存在，并且是一个文件，调用 os.path.isfile(path)将返回 True，否则返回 False
- 如果 path 参数存在，并且是一个文件夹，调用 os.path.isdir(path)将返回 True，否则返回 False。

=== 在 Python 中，读写文件有 3 个步骤:

调用 open() 函数，返回一个 File 对象。
调用 File 对象的 read()或 write()方法。
调用 File 对象的 close()方法，关闭该文件。

helloFile = open('/Users/your_home_folder/hello.txt')

读文件
helloContent = helloFile.read()

写文件
baconFile = open('bacon.txt', 'w')
baconFile.write('Hello world!\n')
baconFile.close()

项目：生成随机试卷及答案

创建35个试卷文件及对应的答案文件。
为每一个试卷随机生成50道题目，次序随机。
为每一个问题提供一个正确答案和三个错误答案，次序随机。
将测试题目写入试卷文件，将对应的答案写入相应的答案文件中。

# -*- coding: UTF-8 -*-
import random
import os

# 进入工作目录
os.chdir('/Users/Yunis/Desktop/学习/Python/random')
# 打印当前工作目录
print("当前工作目录为 %s"%(os.getcwd()))

# 首都以及首都集合字典 习题集及答案
capitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix','Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver', 'Connecticut': 'Hartford', 'Delaware': 'Dover', 'Florida': 'Tallahassee', 'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise', 'Illinois': 'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines', 'Kansas': 'Topeka', 'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge', 'Maine': 'Augusta', 'Maryland': 'Annapolis', 'Massachusetts': 'Boston', 'Michigan': 'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson', 'Missouri': 'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln', 'Nevada': 'Carson City', 'New Hampshire': 'Concord', 'New Jersey': 'Trenton', 'New Mexico': 'Santa Fe', 'New York': 'Albany', 'North Carolina': 'Raleigh',
'North Dakota': 'Bismarck', 'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City', 'Oregon': 'Salem', 'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence', 'South Carolina': 'Columbia', 'South Dakota': 'Pierre', 'Tennessee': 'Nashville', 'Texas': 'Austin', 'Utah': 'Salt Lake City', 'Vermont': 'Montpelier', 'Virginia': 'Richmond', 'Washington': 'Olympia', 'West Virginia': 'Charleston', 'Wisconsin': 'Madison', 'Wyoming': 'Cheyenne'}

# 循环35次 因为要出35分试卷
for quizNum in range(35):
    quizFile = open('capitalsquiz%s.txt' % (quizNum + 1), 'w')
    answerKeyFile = open('capitalsquiz_answers%s.txt' % (quizNum + 1), 'w')
    quizFile.write('Name:\n\nDate:\n\nPeriod:\n\n')
    quizFile.write((' ' * 20) + 'State Capitals Quiz (Form %s)' % (quizNum + 1))
    quizFile.write('\n\n')
    # 获取字典 keys
    states = list(capitals.keys())
    # 用于将一个列表中的元素打乱。
    random.shuffle(states)

    # 每份试卷有50道题目
    for questionNum in range(50):
        # Get right and wrong answers.
        correctAnswer = capitals[states[questionNum]]
        # 获取答案列表
        wrongAnswers = list(capitals.values())
        # 删除正确的答案选项
        del wrongAnswers[wrongAnswers.index(correctAnswer)]
        # 从错误的答案选项随机选出3个错误的答案
        wrongAnswers = random.sample(wrongAnswers, 3)
        # 给当前题目设置答案集合
        answerOptions = wrongAnswers + [correctAnswer]
        # 打乱 答案顺序
        random.shuffle(answerOptions)
        # 在试卷文件 写入 题号 以及题目
        quizFile.write('%s. What is the capital of %s?\n' % (questionNum + 1,states[questionNum]))
        # 在试卷文件 写入答案选项
        for i in range(4):
            quizFile.write(' %s. %s\n' % ('ABCD'[i], answerOptions[i]))
        quizFile.write('\n')
        # 把正确答案的题号以及正确答案写入答案文件
        answerKeyFile.write('%s. %s\n' % (questionNum + 1, 'ABCD'[
        answerOptions.index(correctAnswer)]))

    # 关闭试卷文件
    quizFile.close()
    # 关闭答案文件
    answerKeyFile.close()

项目：多重剪贴板

.pyw 扩展名意味着 Python 运行该程序时，不会显示终端窗口。

# -*- coding: UTF-8 -*-
# mcb.pyw - Saves and loads pieces of text to the clipboard.
#  Usage: py.exe mcb.pyw save <keyword> - Saves clipboard to keyword.
#         py.exe mcb.pyw <keyword> - Loads keyword to clipboard.
#         py.exe mcb.pyw list - Loads all keywords to clipboard.
import shelve, pyperclip, sys
# .pyw 扩展名意味着 Python 运行该程序时，不会显示终端窗口

# 进入工作目录
os.chdir('/Users/Yunis/Desktop/学习/Python/mcb')
# 打印当前工作目录
print("当前工作目录为 %s"%(os.getcwd()))

# shelve是一额简单的数据存储方案，他只有一个函数就是open()，这个函数接收一个参数就是文件名，然后返回一个shelf对象，你可以用他来存储东西，就可以简单的把他当作一个字典，当你存储完毕的时候，就调用close函数来关闭

# 建立一个存储容器
mcbShelf = shelve.open('mcb')

# 判断参数有几个，
if len(sys.argv) == 3:
    # 判断第2个参数是否为save
    if sys.argv[1].lower() == 'save' :
        # 把粘贴板的值保存在 mcbShelf
        mcbShelf[sys.argv[2]] = pyperclip.paste()
    elif  sys.argv[1].lower() == 'del' :
        # 删除关键字对应的值
        mcbShelf.pop(sys.argv[2])
elif len(sys.argv) == 2:
    if sys.argv[1].lower() == 'list':
        # 把容器中保存的key 填充到粘贴板
        pyperclip.copy(str(list(mcbShelf.keys())))
    elif sys.argv[1] in mcbShelf:
        # 根据传的参数，把容器中对应key的值取出来放置在粘贴板中
        pyperclip.copy(mcbShelf[sys.argv[1]])
# 关闭
mcbShelf.close()

组织文件

shutil 模块

shutil.copy(source, destination)
1. 将路径 source 处的文件复制到路径 destination 处的文件夹(source 和 destination 都是字符串)
2. 如果 destination 是一个文件名，它将作为被复制文件的新名字。
3. 该函数返回一个字符串，表示被复制文件的路径。
shutil.copytree(source, destination)
1. 将路径source处的文件夹，包括它的所有文件和子文件夹，复制到路径 destination 处的文件夹。
2. source 和 destination 参数都是字符串。
3. 该函数返回一个字符串，是新复制的文件夹的路径。
shutil.move(source, destination)
1. 将路径 source 处的文件夹移动到路径 destination，并返回新位置的绝对路径的字符串。
2. 如果 destination 指向一个文件夹，source 文件将移动到 destination 中，并保持原来的文件名。
3. destination 路径也可以指定一个文件名。source 文件被移动并改名。
用 os.unlink(path)将删除 path 处的文件。
调用 os.rmdir(path)将删除 path 处的文件夹。该文件夹必须为空，其中没有任何文件和文件夹。
调用 shutil.rmtree(path)将删除 path 处的文件夹，它包含的所有文件和文件夹都会被删除。
os.walk()函数被传入一个字符串值，即一个文件夹的路径。你可以在一个 for 循环语句中使用 os.walk()函数，遍历目录树，就像使用 range()函数遍历一个范围的数字一样。
1. 当前文件夹名称的字符串。
2. 当前文件夹中子文件夹的字符串的列表。
3. 当前文件夹中文件的字符串的列表。

# -*- coding: UTF-8 -*-
import os
for folderName, subfolders, filenames in os.walk('/Users/Yunis/Desktop/学习/Python'):
    print('The current folder is ' + folderName)
    for subfolder in subfolders:
        print('SUBFOLDER OF ' + folderName + ': ' + subfolder)
    for filename in filenames:
        print('FILE INSIDE ' + folderName + ': '+ filename)
    print('')

zipfile 模块

读取 ZIP 文件

# -*- coding: UTF-8 -*-
import os,zipfile

os.chdir('/Users/Yunis/Desktop/学习/Python/zipfile')
exampleZip = zipfile.ZipFile('t.txt.zip')
# 获取压缩文件集合
print(exampleZip.namelist())
# 获取指定文件信息
spamInfo = exampleZip.getinfo('t.txt')
# 文件真实大小
print(spamInfo.file_size)
# 文件压缩后的大小
print(spamInfo.compress_size)
exampleZip.close()

从 ZIP 文件中解压缩

extractall()
- extractall()方法从 ZIP 文件中解压缩所有文件和文件夹
- 可以向 extractall()传递的一个文件夹名称，它将文件解压缩到那个文件夹，而不是当前工作目录。
extract()
- extract()方法从 ZIP 文件中解压缩单个文件
- 传递给 extract()的字符串，必须匹配 namelist()返回的字符串列表中的一个。
- extract()传递第二个参数，将文件解压缩到指定的文件夹，而不是当前工作目录。如果第二个参数指定的文件夹不存在，Python 就会创建它。
- extract() 的返回值是被压缩后文件的绝对路径。

# -*- coding: UTF-8 -*-
import os,zipfile
os.chdir('/Users/Yunis/Desktop/学习/Python/zipfile')
exampleZip = zipfile.ZipFile('t.txt.zip')
# 解压缩所用文件和文件夹 'Yunis' 为路径参数，表示解压缩到这个路径，参数为空表示解压缩到当前路径
# exampleZip.extractall('Yunis')
# 解压缩 单个't.txt' 文件到 'SSY' 路径下，路径为空解压到当前路径
exampleZip.extract('t.txt','SSY')
exampleZip.close()

创建和添加到 ZIP 文件

要创建你自己的压缩 ZIP 文件，必须以“写模式”打开 ZipFile 对象，即传入’w’ 作为第二个参数(这类似于向 open()函数传入’w’，以写模式打开一个文本文件)。

exampleZip = zipfile.ZipFile('t.txt.zip','w')

# -*- coding: UTF-8 -*-
import os,zipfile
os.chdir('/Users/Yunis/Desktop/学习/Python/zipfile')
# 创建压缩文件
newZip = zipfile.ZipFile('new.zip','w')
# 把 t.txt 文件 写入 压缩文件内 compress_type ：压缩方式
newZip.write('t.txt', compress_type=zipfile.ZIP_DEFLATED)
newZip.close()

将带有美国风格日期的文件改名为欧洲风格日期

# -*- coding: UTF-8 -*-
import shutil,os,re

def renameDatesInpath(path):
    # 获取文件路径下的文件集合
    # print(path)
    # 创建正则表达式
    datePattern = re.compile(r'^(.*?)((0|1)\d)-((0|1|2|3)\d)-((19|20)\d\d)(.*?)$')

    # 获取文件路径下的文件集合
    for amerFilename in os.listdir(path):
        # 正则表达式对象匹配结果
        # Regex 对象的 search()方法查找传入的字符串，寻找该正则表达式的所有匹配。如 果字符串中没有找到该正则表达式模式，search()方法将返回 None。如果找到了该模式， search()方法将返回一个 Match 对象。Match 对象有一个 group()方法，它返回被查找字 符串中实际匹配的文本
        mo = datePattern.search(amerFilename)
        # 如果这个文件没有匹配，开始匹配下个文件
        if mo == None :
            continue
        # re.compile(r'^(.*?)((0|1)\d)-((0|1|2|3)\d)-((19|20)\d\d)(.*?)$')
        # group index
        # datePattern = re.compile(r'^(1)((2)3)-((4)5)-((6)7)(8)$')

        # 获取 Match 对象 对应的字符串
        beforePart = mo.group(1)
        monthPart = mo.group(2)
        dayPart = mo.group(4)
        yearPart = mo.group(6)
        afterPart = mo.group(8)
        # 组成新的文件名
        euroFilename = beforePart + dayPart + '-' + monthPart + '-' + yearPart + afterPart


        absWorkingDir = os.path.abspath(path)
        amerFilename = os.path.join(absWorkingDir, amerFilename)
        euroFilename = os.path.join(absWorkingDir, euroFilename)
        print('Renaming "%s" to "%s"...' % (amerFilename, euroFilename))
        shutil.move(amerFilename, euroFilename) # uncomment after testing

renameDatesInpath('/Users/Yunis/Desktop/学习/Python/date')

从 Web 抓取信息

使用 requests 下载文件，并保存到硬盘。

调用 requests.get()下载该文件。
用’wb’调用 open()，以写二进制的方式打开一个新文件。
利用 Respose 对象的 iter_content()方法做循环。
在每次迭代中调用 write()，将内容写入该文件。
调用 close()关闭该文件。

# -*- coding: UTF-8 -*-
import requests,os,sys

currentFilePath = sys.path[0]
webFilePath = os.path.join(currentFilePath, "file")
os.chdir(webFilePath)
print(os.getcwd())
res = requests.get("http://static.open-open.com/lib/uploadImg/20160623/20160623173015_416.png")
playFile = open('python.jpg', 'wb')
for chunk in res.iter_content(100000):
    playFile.write(chunk)
playFile.close()

用 BeautifulSoup 模块解析 HTML

传递给 select()方法的选择器	将匹配…
`soup.select('div')`	所有名为`<div>`的元素
`soup.select('#author')`	带有 `id` 属性为 `author` 的元素
`soup.select('.notice')`	所有使用 `CSS class` 属性名为 `notice` 的元素
`soup.select('div span')`	所有在`<div>`元素之内的`<span>`元素
`soup.select('div > span')`	所有直接在`<div>`元素之内的`<span>`元素，中间没有其他元素
`soup.select('input[name]')`	所有名为`<input>`，并有一个 `name` 属性，其值无所谓的元素
`soup.select('input[type="button"]')`	所有名为`<input>`，并有一个 `type` 属性，其值为 `button` 的元素

这段代码将带有 id="author"的元素，从示例 HTML 中找出来

elems = exampleSoup.select('#author')

“I’m Feeling Lucky” 百度查找

Python小工具-根据输入关键字自动打开百度搜索结果的第一页

项目:下载所有 XKCD 漫画

下载所有 XKCD 漫画

Python编程快速上手 — 让繁琐工作自动化<一>

在不断填坑中前进。。

Python编程快速上手 — 让繁琐工作自动化

基础概念

导入模块

声明函数

不带参数

带有参数

带有多个参数

关键字参数和 `print()`

作用域

异常处理

列表

下标取值

切片获取子列表

列表链接、复制

多重赋值技巧

元组、字符串、列表转换

自动化任务

正则表达式

用 `sub()`方法替换字符串

读写文件

文件及文件路径

项目：生成随机试卷及答案

项目：多重剪贴板

组织文件

shutil 模块

zipfile 模块

读取 ZIP 文件

从 ZIP 文件中解压缩

创建和添加到 ZIP 文件

将带有美国风格日期的文件改名为欧洲风格日期

从 Web 抓取信息

用 BeautifulSoup 模块解析 HTML

“I’m Feeling Lucky” 百度查找

项目:下载所有 XKCD 漫画

CATALOG

FEATURED TAGS

FRIENDS

Python编程快速上手 — 让繁琐工作自动化

基础概念

导入模块

声明函数

不带参数

带有参数

带有多个参数

关键字参数 和 print()

作用域

异常处理

列表

下标取值

切片获取子列表

列表链接、复制

多重赋值技巧

元组 、字符串 、列表 转换

自动化任务

正则表达式

用 sub()方法替换字符串

读写文件

文件及文件路径

项目： 生成随机试卷及答案

项目： 多重剪贴板

组织文件

shutil 模块

zipfile 模块

读取 ZIP 文件

从 ZIP 文件中解压缩

创建和添加到 ZIP 文件

将带有美国风格日期的文件改名为欧洲风格日期

从 Web 抓取信息

用 BeautifulSoup 模块解析 HTML

“I’m Feeling Lucky” 百度 查找

项目:下载所有 XKCD 漫画

CATALOG

FEATURED TAGS

FRIENDS

关键字参数和 `print()`

元组、字符串、列表转换

用 `sub()`方法替换字符串

项目：生成随机试卷及答案

项目：多重剪贴板

“I’m Feeling Lucky” 百度查找