03 - Lists

  • Sequence Types
  • Mutable and Immutable
  • The Common Sequence Interface
  • Applications of Sequences
  • Appendix - Advanced Exercises

Sequence Types

Common sequence types include strings (str) and lists (list).


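Lists (list)

  • A list can be created with values already in it
  • It can also be created empty
  • Lists can be concatenated with + and repeated with *

! Note

  • The elements of a list do not all have to be the same type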
In [1]:
my_list1 = ["a",2016,5566,"PyLadies"]
my_list2 = []
my_list3 = my_list1+[2016,2016.0]
my_list4 = [1,2,3]*3
print (my_list1,bool(my_list1))
print (my_list2,bool(my_list2))
print (my_list3)
print (my_list4)
['a', 2016, 5566, 'PyLadies'] True
[] False
['a', 2016, 5566, 'PyLadies', 2016, 2016.0]
[1, 2, 3, 1, 2, 3, 1, 2, 3]

Accessing List Elements

  • my_list[i]: get the element at index i
  • An index can be negative: index -i is treated as index "length of the list - i"

! Note

  • Python counts from 0
In [2]:
# index     0 , 1  ,  2 ,    3     ,  4 ,  5
my_list = ["a",2016,5566,"PyLadies",2016,2016.0] # a list of length 6
print ("The 4th  element",my_list[3])
print ("The last element",my_list[-1])           # same as index 6-1=5
print ("The second-last element",my_list[-2])    # same as index 6-2=4
The 4th  element PyLadies
The last element 2016.0
The second-last element 2016
In [3]:
my_list = ["a",2016,5566,"PyLadies",2016,2016.0]
b = my_list[1]
my_list[2] = 2017
print(b)
print(my_list)
2016
['a', 2016, 2017, 'PyLadies', 2016, 2016.0]

Strings Are a Lot Like Lists

  • Each element of a string is a single character (a letter or a symbol)
  • Elements are accessed in the same way
In [4]:
my_string = "PyLadies Taiwan"
print ("The 1st  element of my_string = ",my_string[0])
print ("The 8th  element of my_string = ",my_string[7])
print ("The last element of my_string = ",my_string[-1]) 
The 1st  element of my_string =  P
The 8th  element of my_string =  s
The last element of my_string =  n

! Note

  • An index must not go out of range
In [5]:
print(my_string[20])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-5-071b8d4e8c76> in <module>()
----> 1 print(my_string[20])

IndexError: string index out of range
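
One way to avoid this error, shown here as a minimal sketch rather than the only approach, is to compare the index with len() before using it. The variable names index and result are made up for illustration.

```python
my_string = "PyLadies Taiwan"
index = 20  # hypothetical index, chosen to be out of range

# Valid non-negative indexes run from 0 to len(my_string) - 1
if index < len(my_string):
    result = my_string[index]
else:
    result = "index {} is out of range (length is {})".format(index, len(my_string))
print(result)  # index 20 is out of range (length is 15)
```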

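[Exercise]

Here is a group of people sorted by height. Can you tell me the difference in centimeters between the tallest and the shortest?
(Please don't just write print(192-145); try the list indexing you just learned.)

people = [145,148,151,153,158,161,163,164,166,168,170,172,175,192]

Hint:

  • Because the heights are sorted, the shortest is the first element and the tallest is the last
    => last element minus first element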
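
One possible solution, sketched under the assumption that the list really is sorted in ascending order. The names shortest, tallest, and difference are made up for illustration.

```python
people = [145,148,151,153,158,161,163,164,166,168,170,172,175,192]

shortest = people[0]    # first element of the sorted list
tallest = people[-1]    # last element of the sorted list
difference = tallest - shortest
print(difference)  # 47
```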

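Mutable and Immutable

int, float, and str are immutable objects: once created, their value cannot be changed.
list is a mutable object.

! Note

The following code makes the name a point to a different object.
It is like peeling off a label and sticking it onto something else; it does not change the value of the original object!

a = 3  
a = 4

If no other name points to the original object 3, it becomes garbage and can be reclaimed by the system.

Checking Whether Two Names Point to the Same Object

  • s is t, s is not t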
In [6]:
a = 3
b = 3
c = a
print(a is b,a is c)
print(id(a),id(b),id(c))
a += 2  # a = 5
b = 4   
print(a is b,a is c)
print(id(a),id(b),id(c))
True True
4383230064 4383230064 4383230064
False False
4383230128 4383230096 4383230064

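With an immutable object, an operation or a reassignment simply points the name at a different object.
(a is b was True at first because CPython keeps a single shared object for small integers such as 3.)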

In [7]:
l1 = ["a",2016,5566,"PyLadies"]
l2 = l1
print(l1 is l2,id(l1),id(l2))
True 4432372232 4432372232
In [8]:
l1 += ["Hi"]
print(l1 is l2,id(l1),id(l2))
print(l1)
print(l2)
True 4432372232 4432372232
['a', 2016, 5566, 'PyLadies', 'Hi']
['a', 2016, 5566, 'PyLadies', 'Hi']
In [9]:
l2[2] = 2017
print(l1 is l2,id(l1),id(l2))
print(l1)
print(l2)
True 4432372232 4432372232
['a', 2016, 2017, 'PyLadies', 'Hi']
['a', 2016, 2017, 'PyLadies', 'Hi']

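! Note

  • However a mutable object is modified in place, its identity (id) does not change
  • If two names point to the same mutable object, then once the object is modified,
    both names see the same modified object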
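
The notes above can be checked with a short sketch. The list contents are made up for illustration. The point is that += changes the shared list in place, while l1 = l1 + [...] builds a new list and only re-points the name l1.

```python
l1 = ["a", 2016]
l2 = l1                # l2 is a second name for the same list

l1 += ["Hi"]           # in place: the shared list grows
same_after_inplace = l1 is l2
print(same_after_inplace, l2)      # True ['a', 2016, 'Hi']

l1 = l1 + ["again"]    # builds a new list and re-points l1 at it
same_after_rebind = l1 is l2
print(same_after_rebind, l1, l2)   # False ['a', 2016, 'Hi', 'again'] ['a', 2016, 'Hi']
```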

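The Common Sequence Interface

Accessing an element with sequence[index], as shown earlier, is part of this common interface too!

Sequence Length

len(s)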

In [10]:
my_string = "PyLadies Taiwan"
my_list = ["a",2016,5566,"PyLadies",2016,2016.0]
print ("Length of my_string = ",len(my_string))
print ("Length of my_list = ",len(my_list))
Length of my_string =  15
Length of my_list =  6

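Numeric Calculations

  • max(s): the largest value in the sequence
  • min(s): the smallest value in the sequence
  • sum(s): the total of the sequence; only works when every element is a number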
In [11]:
l = [3, 4, 2.1, 1]
s = "PyLadies"
print(max(l),min(l),sum(l))
print(max(s),min(s))
print(sum(s))
4 1 10.1
y L
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-4b09b5b7b432> in <module>()
      3 print(max(l),min(l),sum(l))
      4 print(max(s),min(s))
----> 5 print(sum(s))

TypeError: unsupported operand type(s) for +: 'int' and 'str'

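[Exercise]

Summarize the scores of the whole class:

  • What is the gap between the highest and the lowest score? (Ans. 58)
  • What is the average score of the class? (Ans. 71.4285...)
score_of_each_student = [85,70,54,87,98,66,40]

Hint:

  • Average: sum of the students' scores / number of students
  • Number of students: length of the sequence
  • Highest score: maximum of the sequence
  • Lowest score: minimum of the sequence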
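
One possible solution that follows the hints directly. The names highest, lowest, gap, and average are made up for illustration.

```python
score_of_each_student = [85,70,54,87,98,66,40]

highest = max(score_of_each_student)
lowest = min(score_of_each_student)
gap = highest - lowest
# Average = total of the scores / number of students
average = sum(score_of_each_student) / len(score_of_each_student)

print(gap)      # 58
print(average)  # roughly 71.4285
```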

Checking Whether a Sequence Contains an Element

x in s, x not in s

In [13]:
my_string = "PyLadies Taiwan"

if "PyLadies" in my_string:
    print ("\"PyLadies\" found")
if "Python" in my_string:
    print ("\"Python\" found")
if "Taiwan" in my_string:
    print ("\"Taiwan\" found")
"PyLadies" found
"Taiwan" found
In [14]:
my_list = ["a",2016,5566,"PyLadies",2016,2016.0]
print(2016 in my_list)
print("2016" in my_list)
True
False

Counting Occurrences of an Element in a Sequence

s.count(x)

In [15]:
my_string = "PyLadies Taiwan"
my_list = ["a",2016,5566,"PyLadies",2016,2016.0]
print ("Times 'a' appears in my_string = ",my_string.count('a'))
print ("Times 2016 appears in my_list = ",my_list.count(2016))  # 2016.0 == 2016, so it is counted too
Times 'a' appears in my_string =  3
Times 2016 appears in my_list =  3

Think About It

Which sequence method would find how many times "in" appears in the following article?

In [16]:
article = """
Bubble tea represents the "QQ" food texture that Taiwanese love. 
The phrase refers to something that is especially chewy, like the tapioca balls that form the 'bubbles' in bubble tea. 
It's said this unusual drink was invented out of boredom. 
"""
print (article.count("in"))
4

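"in" really does appear 4 times in the article: someth"in"g, "in", dr"in"k, "in"vented
But if what we want is "in" as a standalone word, we need to split the article first and then count!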

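Applications of Sequences

Splitting

result_list = original_string.split(separator, maxsplit)

  • Cuts original_string at every separator and returns a new list (the original string is not changed)
  • maxsplit, the maximum number of splits, defaults to no limit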
In [17]:
s = "Hi PyLadies Taiwan"
l = s.split(" ")
print (l)
print (s.split(" ",1))
print (s)
print (l[0])
print (l[-1])
['Hi', 'PyLadies', 'Taiwan']
['Hi', 'PyLadies Taiwan']
Hi PyLadies Taiwan
Hi
Taiwan
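
A side note on the separator, offered as a small sketch rather than part of the original lesson: calling split() with no argument splits on any run of whitespace (spaces, tabs, newlines), while split(" ") cuts at every single space. The sample string below is made up for illustration.

```python
s = "Hi  PyLadies\nTaiwan"   # two spaces, then a newline

by_space = s.split(" ")      # cuts at each single space only
by_whitespace = s.split()    # cuts at any run of whitespace

print(by_space)       # ['Hi', '', 'PyLadies\nTaiwan']
print(by_whitespace)  # ['Hi', 'PyLadies', 'Taiwan']
```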

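[Exercise]

Now think about which sequence methods would find how many times the word "in" appears in the following article

  • Split the article into a list of words (English words are separated by a single space)
  • Count how many times "in" appears in that list of words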
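
The two steps above can be sketched as follows. This is one possible solution; the names words and in_count are made up for illustration, and the article text is repeated so the block runs on its own.

```python
article = """
Bubble tea represents the "QQ" food texture that Taiwanese love.
The phrase refers to something that is especially chewy, like the tapioca balls that form the 'bubbles' in bubble tea.
It's said this unusual drink was invented out of boredom.
"""

words = article.split(" ")      # step 1: split into a list of words
in_count = words.count("in")    # step 2: count elements exactly equal to "in"
print(in_count)  # 1
```
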
In [18]:
article = """
Bubble tea represents the "QQ" food texture that Taiwanese love. 
The phrase refers to something that is especially chewy, like the tapioca balls that form the 'bubbles' in bubble tea. 
It's said this unusual drink was invented out of boredom. 
"""

So how many times does "tea" appear in this article?

In [20]:
article = """
Bubble tea represents the "QQ" food texture that Taiwanese love. 
The phrase refers to something that is especially chewy, like the tapioca balls that form the 'bubbles' in bubble tea. 
It's said this unusual drink was invented out of boredom. 
"""

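But we can see tea twice in the article, so why was it counted only once?
The reason is that after split, the two occurrences become "tea" and "tea."
These are different strings, and count("tea") only counts elements that are exactly equal to "tea"

So replace the common punctuation marks first, and then count!

[Exercise]

Now think about which sequence methods would find how many times the word "tea" appears in the article above

  • Replace the punctuation marks
  • Split the article into words
  • Count how many times "tea" appears among the words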
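
One possible solution, sketched with chained replace calls. Replacing only ".", ",", and the newline character is an assumption that happens to be enough for this article; other texts may need more punctuation marks. The names cleaned, words, and tea_count are made up for illustration.

```python
article = """
Bubble tea represents the "QQ" food texture that Taiwanese love.
The phrase refers to something that is especially chewy, like the tapioca balls that form the 'bubbles' in bubble tea.
It's said this unusual drink was invented out of boredom.
"""

# Step 1: turn punctuation and newlines into spaces
cleaned = article.replace(".", " ").replace(",", " ").replace("\n", " ")
# Step 2: split into words (empty strings from double spaces are harmless here)
words = cleaned.split(" ")
# Step 3: count elements exactly equal to "tea"
tea_count = words.count("tea")
print(tea_count)  # 2
```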

By the same logic,
when you use in to check whether a "word" exists in a string,
remember to split the string into a list of words first

In [23]:
my_string = "PyLadies Taiwan"
if "Py" in my_string:
    print ("\"Py\" found in my_string")

my_string_list = my_string.split(" ")
if "Py" in my_string_list:
    print ("\"Py\" found in my_string_list")
"Py" found in my_string

Use Case - Normalizing Dates (year-month-day hour:minute:second)

In [24]:
ebc_datetime = "2016-10-17 17:00"
back_datetime = "2016-10-11, 19:55"
ptt_datetime = "Tue Oct 18 23:22:05 2016"

# These two only lack the seconds (and have a stray comma)
format_ebc_datetime = ebc_datetime+":00"
format_back_datetime = back_datetime.replace(",","")+":00"

# ptt: split into ["Tue","Oct","18","23:22:05","2016"] and reorder the parts
ptt_split_list = ptt_datetime.split(" ")
ptt_year = ptt_split_list[-1]
ptt_month = ptt_split_list[1].replace("Oct","10")  # only handles October
ptt_date = ptt_split_list[2]
ptt_time = ptt_split_list[3]
format_ptt_datetime = "{}-{}-{} {}".format(ptt_year,ptt_month,ptt_date,ptt_time)

print (format_ebc_datetime)
print (format_back_datetime)
print (format_ptt_datetime)
2016-10-17 17:00:00
2016-10-11 19:55:00
2016-10-18 23:22:05
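
The replace("Oct","10") step above only covers October. As a hedged sketch of one way to cover every month, a list of month abbreviations can be searched with the list method index(). The months list and the names parts, month_number, and formatted are assumptions introduced for illustration.

```python
# Month abbreviations in calendar order (assumed English three-letter forms)
months = ["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]

ptt_datetime = "Tue Oct 18 23:22:05 2016"
parts = ptt_datetime.split(" ")

# index() returns the position of the first match; "Oct" is at index 9, so add 1
month_number = months.index(parts[1]) + 1
# {:02d} pads single-digit months with a leading zero
formatted = "{}-{:02d}-{} {}".format(parts[-1], month_number, parts[2], parts[3])
print(formatted)  # 2016-10-18 23:22:05
```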

Use Case - Normalizing Dates (year-month-day hour:minute:second), 24-Hour Clock

In [25]:
yahoo_datetime = "2016年10月18日 下午10:33"
# Date part: "2016年10月18日" -> "2016-10-18"
temp_yahoo_date = yahoo_datetime.split(" ")[0].replace("年","-").replace("月","-")
temp_yahoo_date = temp_yahoo_date.replace("日","")
# Time part: "下午10:33" -> hour "下午10", minutes "33"
temp_yahoo_time = yahoo_datetime.split(" ")[-1]
temp_yahoo_hour = temp_yahoo_time.split(":")[0]
temp_yahoo_mins = temp_yahoo_time.split(":")[-1]

if "下午" in temp_yahoo_hour:   # 下午 = PM
    # % 12 keeps 12 PM as 12 instead of turning it into 24
    temp_yahoo_hour_int = int(temp_yahoo_hour.replace("下午",""))%12+12
else:                           # 上午 = AM
    # % 12 turns 12 AM into 0
    temp_yahoo_hour_int = int(temp_yahoo_hour.replace("上午",""))%12
temp_yahoo_hour = "{:02d}".format(temp_yahoo_hour_int)  # zero-pad, e.g. 9 -> "09"

format_yahoo_datetime = "{} {}:{}:00".format(temp_yahoo_date,temp_yahoo_hour,temp_yahoo_mins)
print (format_yahoo_datetime)
2016-10-18 22:33:00

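Joining

result_string = separator.join(sequence)

  • The opposite of split: joins a list of strings together with a separator string between them
  • Only works when every element is a string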
In [26]:
l = ["Hello","PyLadies","Taiwan"]
s = "Hi PyLadies"
print (" ".join(l))
print (".".join(s))
Hello PyLadies Taiwan
H.i. .P.y.L.a.d.i.e.s
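
To illustrate the second point, here is a small sketch with a made-up list that mixes a string and a number. Converting the number with str() first is one way to make join work.

```python
mixed = ["PyLadies", 2016]
# " ".join(mixed) would raise a TypeError, because 2016 is not a string

all_strings = [mixed[0], str(mixed[1])]   # convert the number to a string first
joined = " ".join(all_strings)
print(joined)  # PyLadies 2016
```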

Use Case - Formatting as ('name1','name2','name3')

  • Often used when building SQL statements
In [27]:
l = ["Kelly","Mars","Maomao"]
s = "','".join(l)
print(s)
print("('{}')".format(s))
Kelly','Mars','Maomao
('Kelly','Mars','Maomao')

Think About It

Suppose I want to count how many times each word appears in this article:

In [28]:
article = """Bubble tea represents the "QQ" food texture that Taiwanese love. 
The phrase refers to something that is especially chewy, like the tapioca balls that form the 'bubbles' in bubble tea. 
It's said this unusual drink was invented out of boredom. 
"""
word_of_article = article.split(" ")
print(word_of_article[0],word_of_article.count(word_of_article[0]))
print(word_of_article[1],word_of_article.count(word_of_article[1]))
print(word_of_article[2],word_of_article.count(word_of_article[2]))
Bubble 1
tea 1
represents 1

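We can't keep writing these out one at a time...
The next chapter covers a very powerful programming technique: loops.
In most programming languages, lists plus loops can solve a large share of problems!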

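Appendix - Advanced Exercises

Simple Calculator

Read one line of text and print the result of the calculation

  • Supports only addition, subtraction, multiplication, division, exponentiation, and remainder
  • The input has exactly two numbers and one operator

Examples:

  • 11+2 -> 13
  • 2**3 -> 8
  • 10%3 -> 1

Hint: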

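keyin = input("Enter the expression to calculate (two numbers only), e.g. 2**3\n")
if "+" in keyin:
    "Split the string on +, take the two numbers from the list, give each a name, and print their sum"
elif "-" in keyin:
    "Split the string on -, take the two numbers from the list, give each a name, and print their difference"
# ...and so on for the remaining operators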
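
For reference, one way the hint could be filled in is sketched below. It is not the only solution, and it makes several assumptions: keyin is fixed to a sample value instead of calling input(), numbers are converted with float() (so 2**3 prints 8.0 rather than 8), and negative numbers such as -2+3 are not handled. The names parts and result are made up for illustration.

```python
keyin = "2**3"  # stand-in for input(...) so the sketch runs without typing

# "**" must be checked before "*", because "2**3" also contains "*"
if "**" in keyin:
    parts = keyin.split("**")
    result = float(parts[0]) ** float(parts[1])
elif "+" in keyin:
    parts = keyin.split("+")
    result = float(parts[0]) + float(parts[1])
elif "-" in keyin:
    parts = keyin.split("-")
    result = float(parts[0]) - float(parts[1])
elif "*" in keyin:
    parts = keyin.split("*")
    result = float(parts[0]) * float(parts[1])
elif "/" in keyin:
    parts = keyin.split("/")
    result = float(parts[0]) / float(parts[1])
elif "%" in keyin:
    parts = keyin.split("%")
    result = float(parts[0]) % float(parts[1])
else:
    result = None  # no supported operator found

print(result)  # 8.0
```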