Python Memory Management

1 Advanced Python Memory Management - xiaorui.cc

2 [diagram: memory layers, highest first]
+3  Object-specific allocators: [ int ] [ dict ] [ list ] ... [ string ] (object-specific memory) | Python core (non-object memory)
+2  Python's object allocator (object memory) | Internal buffers
+1  Python's raw memory allocator, PyMem_ API (Python memory, under PyMem manager's control)
 0  Underlying general-purpose allocator, ex: C library malloc (virtual memory allocated for the python process)
=========================================================================
-1  OS-specific Virtual Memory Manager, VMM (kernel dynamic storage allocation & management, page-based)
-2  Physical memory: ROM/RAM | Secondary storage (swap)

3 [table: Request in bytes | Size of allocated block | Size class idx]
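
The three columns of this table can be expressed as a small function. This is only a sketch: it assumes the 8-byte alignment and 512-byte small-request threshold used by older CPython builds (newer 64-bit builds align to 16 bytes), and `size_class` is a hypothetical helper, not a CPython API.

```python
ALIGNMENT = 8                    # assumed alignment in bytes
SMALL_REQUEST_THRESHOLD = 512    # assumed upper bound for pooled requests


def size_class(nbytes):
    """Return (size class idx, allocated block size) for a request,
    or None when the request is not served from the pools."""
    if nbytes <= 0 or nbytes > SMALL_REQUEST_THRESHOLD:
        return None
    idx = (nbytes - 1) // ALIGNMENT
    return idx, (idx + 1) * ALIGNMENT


# A few sample rows: (request in bytes, (size class idx, block size))
rows = [(n, size_class(n)) for n in (1, 8, 9, 24, 25, 512, 513)]
```

Requests above the threshold return None here, standing in for the case where the request is passed on to the underlying allocator.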

4 Terminology: process heap, Arenas, Pool, UsedPools, FreePools

5 Methods: POSIX malloc, Python memory pool, object buffer pool

6 Arena
[diagram: process layout (stack, heap, bss, init data, text); the heap is labeled "malloc heap & pool". An Arena holds Pools: a UsedPool containing Free Blocks and Used Blocks, and FreePools, which are Pool Headers with no Blocks.]

7 UsedPool design
[diagram: UsedPools; each Pool is a Pool Header followed by Free Blocks and Used Blocks]
Allocation and release: all Blocks in the same Pool have the same size; a single Pool is 4 KB; Blocks and Pools are both kept in singly linked lists.
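
The rules on this slide (equal-size Blocks, a 4 KB Pool, a singly linked free list) can be modeled in a few lines. This is a toy sketch, not CPython's implementation: the `Pool` class, its offsets, and the 48-byte header size are all hypothetical choices for illustration.

```python
POOL_SIZE = 4096          # 4 KB per Pool, as stated on the slide


class Pool:
    """Toy Pool: hands out equal-size blocks identified by byte offset."""

    def __init__(self, block_size, header_size=48):   # header size is assumed
        self.block_size = block_size
        self.capacity = (POOL_SIZE - header_size) // block_size
        self.free_head = None        # head of the singly linked free list
        self.next_free = {}          # freed offset -> next freed offset
        self.next_unused = header_size   # first never-used offset

    def alloc(self):
        if self.free_head is not None:            # reuse a freed block first
            offset = self.free_head
            self.free_head = self.next_free.pop(offset)
            return offset
        if self.next_unused + self.block_size <= POOL_SIZE:
            offset = self.next_unused             # carve a fresh block
            self.next_unused += self.block_size
            return offset
        return None                               # Pool is full

    def free(self, offset):
        self.next_free[offset] = self.free_head   # push onto the free list
        self.free_head = offset


pool = Pool(block_size=32)
first = pool.alloc()
second = pool.alloc()
pool.free(first)
reused = pool.alloc()     # the freed block is handed out again
```

Freeing pushes a block onto the head of the list, so the most recently freed block is the next one allocated.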

8 FreePool design
[diagram: FreePools; each entry is a Pool Header with no Blocks]
A Pool is 4 KB in size. A Pool that has been cleaned up keeps only its Header and has no Blocks.

9 Where are variables stored?
[diagram: run-time stack | heap: list [1, 2, 3], dict {'n': 1}, int 1]

10 why?
In [1]: a = 123
In [2]: b = 123
In [3]: a is b
Out[3]: True
In [4]: a = 1000
In [5]: b = 1000
In [6]: a is b
Out[6]: False
In [7]: a = 'n'
In [8]: b = 'n'
In [9]: a is b
Out[9]: True
In [10]: a = "python"
In [11]: b = "python"
In [12]: a is b
Out[12]: True
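
The session above can be reproduced in a script, with one caveat: inside a single compiled unit, equal literals may be merged by the compiler, so the sketch below builds its values at run time. The results describe CPython behaviour and are not guaranteed by the language.

```python
import sys

# Build the values at run time so the compiler cannot merge equal literals.
small_a, small_b = int("123"), int("123")
large_a, large_b = int("1000"), int("1000")

small_shared = small_a is small_b    # small ints come from a preallocated cache
large_shared = large_a is large_b    # larger ints are separate objects

# A string built at run time is a new object; sys.intern maps it to the
# single shared copy kept in the interned-string table.
word = "".join(["py", "thon"])
interned_shared = sys.intern(word) is sys.intern("python")
```

Use `==` to compare values; `is` only answers whether two names refer to the same object.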

11 why? Only references?
In [10]: a = b = 'nima'
In [11]: b = a
In [12]: a is b
Out[12]: True
In [13]: b = 'hehe'
In [14]: a is b
Out[14]: False
In [1]: def go(var):
   ...:     print(id(var))
   ...:
In [2]: id(a)
Out[2]:
In [3]: go(a)
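
A script version of the same experiment, as a minimal sketch: the function returns the id instead of printing it, so the two values can be compared directly.

```python
def go(var):
    # The parameter is just another name for the caller's object.
    return id(var)


a = b = 'nima'
same_object_in_call = go(a) == id(a)   # the call passed a reference, not a copy

b = 'hehe'                 # rebinding b points it at a different object
a_unchanged = a == 'nima'  # a still refers to the original string
names_diverged = a is not b
```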

12 How are Python objects stored in memory?
Python has names, not variables!
[diagram: names -> object]

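13 Integer object pool
[diagram: var_1, var_2, var_3, var_4 pointing at integer objects]
Small integers: created when the interpreter initializes, 28 bytes each; names bound to the same value get the same addr!
Large integers: not the same addr!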

14 Integer object pool
[diagram: Block List -> PyIntBlock -> PyIntBlock; Free List -> PyIntBlock -> PyIntBlock]
This memory is never returned to the Arena or the OS!

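15 Character object pool
[diagram: a, b, c, d; var_1 and var_2 point at the same entry]
Single characters, 38 bytes each, initialized by the interpreter: the same addr!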

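16 String object pool
[diagram: entries aa, en, cao, oh, woyao, buyao, kuai, feile, each with ref and hash]
The strings are stored in a hash table; var_1 and var_2 share one address, and the reference count is recorded.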

17 PyObject_GC_TRACK
[diagram: PyGC_Head -> Node -> Node]
func: PyList_New
func: list_dealloc
ref:
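
From Python, whether an object has been placed on the collector's list can be checked with `gc.is_tracked`. A small sketch, describing CPython behaviour:

```python
import gc

list_tracked = gc.is_tracked([])     # lists are containers, so they are tracked
int_tracked = gc.is_tracked(0)       # atomic objects cannot form cycles
str_tracked = gc.is_tracked("a")
```

Only tracked objects are examined by the cycle collector; everything else is managed by reference counting alone.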

18 ref count
x = 300        ref += 1
y = x          ref += 1
z = [x, y]     ref += 2
References -> 4!
[diagram: x, y and z all point at the object 300]
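
The increments can be checked with `sys.getrefcount`. This sketch uses a fresh object instead of the literal 300, on the assumption that an int constant may also be referenced by the compiled code itself, which would skew the absolute number. Only the difference between two readings is compared, because each call adds a temporary reference of its own.

```python
import sys


class Payload:
    """Stand-in for the object that the names refer to."""


x = Payload()
base = sys.getrefcount(x)    # reading taken while only x refers to the object
y = x                        # ref += 1
z = [x, y]                   # ref += 2, one per list slot
added = sys.getrefcount(x) - base
```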

19 What does del do?
x = 300
y = x
del x          ref -= 1
References -> 1!
[diagram: x (crossed out) and y point at the object]
The del statement doesn't delete objects. It removes that name as a reference to that object and reduces the ref count by 1.

20 ref count cases
def go():
    w = 300      # ref count +1
go()             # w is out of scope; ref count -1
a = 'fuc.'
del a            # del a; ref count -1
b = 'en, a'
b = None         # reassignment; ref count -1
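
The three ways a count drops (leaving scope, del, reassignment) can be observed directly. A minimal sketch, assuming CPython's immediate reference counting; `Payload` is a hypothetical stand-in class, and a weak reference is used as a probe because it does not keep its target alive.

```python
import sys
import weakref


class Payload:
    """Plain object whose lifetime can be observed."""


def go():
    w = Payload()              # ref count +1 while w is in scope
    return weakref.ref(w)      # the probe does not add a strong reference


probe = go()                   # w is out of scope: ref count -1
freed_after_scope_exit = probe() is None

obj = Payload()
base = sys.getrefcount(obj)
a = obj                                        # +1
delta_bound = sys.getrefcount(obj) - base
del a                                          # -1
delta_after_del = sys.getrefcount(obj) - base
b = obj                                        # +1
b = None                                       # reassignment: -1
delta_after_rebind = sys.getrefcount(obj) - base
```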

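21 cyclical ref
class Node:
    def __init__(self, va):
        self.va = va
        self.next = None
    def set_next(self, node):
        # renamed from next(): assigning self.next would shadow a method called next
        self.next = node
mid = Node('root')
left = Node('left')
right = Node('right')
mid.set_next(left)
left.set_next(right)
right.set_next(left)
[diagram: mid -> left <-> right]
If we del the mid node, how do left and right get freed?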
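
A short experiment answers the question for the left/right pair. This is a sketch under CPython: automatic collection is switched off during the test so that only the explicit `gc.collect()` call can free the cycle, and a weak reference serves as the probe.

```python
import gc
import weakref


class Node:
    def __init__(self, va):
        self.va = va
        self.next = None


left = Node('left')
right = Node('right')
left.next = right
right.next = left              # left <-> right now form a cycle

probe = weakref.ref(left)      # observes left without keeping it alive

gc.disable()                   # keep automatic collection out of the experiment
del left, right                # both names are gone, but the cycle remains
alive_after_del = probe() is not None

gc.collect()                   # the cycle collector finds and frees the pair
alive_after_collect = probe() is not None
gc.enable()
```

Reference counting alone never frees the pair, because each node still holds a reference to the other.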

22 mark & sweep
[diagram: gc root and an object graph with nodes b, a, R, c, w, K, G]

23 Generational collection
[diagram: three PyGC_Head lists of nodes: Young, Old, Permanent]
Divide and conquer; improves efficiency; based on object lifetime; trades space for time.

24 when gc?
import gc
gc.set_threshold(700, 10, 5)
Counter? 700: the allocation counter (PyMem API). 10? 5?
Generation 0 collection -> generation 1 collection (N % 10) -> generation 2 collection (N % 5)
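
The thresholds and the live counters can be inspected from Python. A small sketch using the slide's values; only the first two thresholds are read back, on the assumption that handling of the third may differ between CPython versions.

```python
import gc

saved = gc.get_threshold()         # remember the current settings
gc.set_threshold(700, 10, 5)       # values from the slide

threshold0, threshold1 = gc.get_threshold()[:2]
counts = gc.get_count()            # current counter for each generation

gc.set_threshold(*saved)           # restore the original settings
```

Restoring the saved values keeps the experiment from changing collector behaviour for the rest of the program.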

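25 Summary
Allocate memory -> the threshold is exceeded -> garbage collection is triggered -> the linked lists of all collectable objects are merged -> traverse them and compute the effective reference counts -> split into two sets: effective reference count = 0 and effective reference count > 0 -> objects with a count > 0 are moved to an older generation -> objects with a count = 0 are reclaimed -> reclamation traverses each element in the container and decrements that element's reference count (breaking the reference cycle) -> on each decrement, if an object's reference count reaches 0, memory reclamation is triggered -> Python's low-level memory management reclaims the memory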
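
The "effective reference count" step can be sketched on a plain dictionary model. This is an illustration, not CPython's code: `find_unreachable` and its arguments are hypothetical. It subtracts references that come from inside the tracked set, and it adds one step the summary leaves implicit: anything reachable from an object with a count above zero is also kept.

```python
def find_unreachable(refcounts, refs):
    """refcounts: object name -> total reference count.
    refs: object name -> names of tracked objects it refers to.
    Returns the set of names that only cycles keep alive."""
    effective = dict(refcounts)
    for targets in refs.values():
        for target in targets:
            if target in effective:
                effective[target] -= 1      # cancel an internal reference

    # Objects still above zero are referenced from outside the tracked set.
    reachable = set()
    stack = [name for name, count in effective.items() if count > 0]
    while stack:
        name = stack.pop()
        if name in reachable:
            continue
        reachable.add(name)
        stack.extend(t for t in refs.get(name, ()) if t in effective)

    return set(effective) - reachable


# a <-> b is an orphaned cycle; c is held by a name and refers to e.
refcounts = {'a': 1, 'b': 1, 'c': 1, 'e': 1}
refs = {'a': ['b'], 'b': ['a'], 'c': ['e']}
garbage = find_unreachable(refcounts, refs)
```

Here `e` ends with an effective count of 0 but survives, because it is reachable from `c`.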

26 weakref
Weak references: they do not take part in reference counting, and they solve reference cycles.
import weakref
class Expensive(object):
    def __del__(self):
        print('(Deleting %s)' % self)
obj = Expensive()
r = weakref.ref(obj)
del obj
print('r():', r())
class Parent(object):
    def __init__(self):
        self.children = [Child(self)]
class Child(object):
    def __init__(self, parent):
        self.parent = weakref.proxy(parent)
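
What the Parent/Child design buys can be checked with a short run. A sketch assuming CPython's immediate reference counting: because the child holds only a proxy, deleting the last name for the parent frees it at once, and later use of the proxy raises `ReferenceError`.

```python
import weakref


class Child(object):
    def __init__(self, parent):
        self.parent = weakref.proxy(parent)   # weak: adds no strong reference


class Parent(object):
    def __init__(self):
        self.children = [Child(self)]


p = Parent()
child = p.children[0]      # keep the child alive on its own
del p                      # no cycle, so the parent is freed immediately

try:
    child.parent.children  # using the proxy after its target is gone
    parent_alive = True
except ReferenceError:
    parent_alive = False
```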

27 Mutable vs. immutable (obj)
Mutable: list, dict
Immutable: string, int, tuple
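
The difference shows up in what `+=` does to object identity. A minimal sketch:

```python
nums = [1, 2]
list_id = id(nums)
nums += [3]                        # in place: still the same list object
list_same_object = id(nums) == list_id

point = (1, 2)
old_point = point                  # keep the original alive for comparison
point += (3,)                      # builds a new tuple and rebinds the name
tuple_same_object = point is old_point
```

The original tuple is untouched; only the name `point` moved to a new object.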

28 container objects
a = [10, 10, 11]
b = a
[diagram: a and b refer to one PyListObject (type list, with rc, items, size); its items refer to a PyObject of type integer with value 10 and rc 2, and a PyObject of type integer with value 11 and rc 1]

29 copy.copy
a = [10, 10, [10, 11]]
b = copy.copy(a)
[diagram: a and b are two separate PyListObjects (type list, rc 1 each); their items refer to the same objects: the integer 10 and one shared inner PyListObject (rc 2) holding the integers 10 and 11]
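
The sharing described for the diagram can be confirmed with identity checks. A minimal sketch:

```python
import copy

a = [10, 10, [10, 11]]
b = copy.copy(a)

outer_is_new = b is not a          # the outer list was duplicated
inner_is_shared = b[2] is a[2]     # the inner list was not

b[2].append(12)                    # so this change is visible through a as well
```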

30 copy.deepcopy
a = [10, [10, 11]]
b = copy.deepcopy(a)
[diagram: a and b are separate PyListObjects (type list, rc 1 each), and each has its own inner PyListObject; the integer objects 10 and 11 are still shared]
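
With a deep copy the inner list is duplicated too, so changes no longer leak across. A minimal sketch:

```python
import copy

a = [10, [10, 11]]
b = copy.deepcopy(a)

inner_is_new = b[1] is not a[1]    # each outer list has its own inner list
b[1].append(12)                    # only b's inner list changes
```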

31 diy gc
import gc
import sys
gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_LEAK)
a = []
b = []
a.append(b)
print('a refcount:', sys.getrefcount(a))  # 2
print('b refcount:', sys.getrefcount(b))  # 3
del a
del b
print(gc.collect())  # 0

32 Garbage collector optimization
Memory bound: lower the threshold to trade time for space.
CPU bound: raise the threshold to trade space for time.
Pause gc and introduce a master/worker design.
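
Pausing the collector around a CPU-bound section can be wrapped in a context manager. This is one possible sketch; `gc_paused` is a hypothetical helper, and it pauses only the cycle collector, since reference counting keeps working throughout.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable automatic cycle collection inside the with-block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:            # restore only what was changed
            gc.enable()


gc.enable()
with gc_paused():
    enabled_inside = gc.isenabled()
    total = sum(i * i for i in range(1000))    # stand-in for CPU-bound work
enabled_after = gc.isenabled()
```

Cycles created inside the block accumulate until collection resumes, so memory use can grow in exchange for the saved time.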

33 Q & A
How do reference counting and the GIL affect each other?
Is gc atomic?
Does gc cause stop-the-world pauses?

34 END xiaorui.cc