
LaTeX 1968 [1]

InftyReader[2] [3] [4] 5 10 [4, 5] [6] [7] [7] [7] [7] [7] Zernike [7] [1] [1] [1] [7] [7] [8] [6] [1] [9] [10] [11] [12] :

\[ Y = 0.309R + 0.609G + 0.082B \]

R G B Y

\[ Y = \frac{316R + 624G + 84B}{1024} \]

A:

\[ Y = 255 - \left(255 - \frac{316R + 624G + 84B}{1024}\right)\frac{A}{255} \]

8- [13] 8- [13] kfill [14] Otsu [12] Otsu
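The integer form of the luminance formula can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `luminance` is hypothetical, 8-bit channels are assumed, floor division stands in for whatever rounding the original implementation uses, and the alpha term is read as compositing the pixel over a white background.

```python
def luminance(r, g, b, a=255):
    """Integer luminance of an 8-bit RGBA pixel (illustrative sketch).

    Uses Y = (316R + 624G + 84B) / 1024, which approximates
    0.309R + 0.609G + 0.082B without floating point.
    Assumption: alpha A blends the pixel over white (Y = 255).
    """
    y = (316 * r + 624 * g + 84 * b) // 1024
    return 255 - (255 - y) * a // 255


if __name__ == "__main__":
    print(luminance(255, 255, 255))   # opaque white
    print(luminance(0, 0, 0))         # opaque black
    print(luminance(0, 0, 0, a=0))    # fully transparent pixel
```

The weights sum to 1024, so a pure white pixel maps exactly to 255 and the division can be replaced by a 10-bit shift.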

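For a threshold $t$ from 0 to 255, the pixels split into two classes at $t$, with pixel counts $n_1^{(t)}$, $n_2^{(t)}$ and mean gray levels $\mu_1^{(t)}$, $\mu_2^{(t)}$. With $n = n_1^{(t)} + n_2^{(t)}$ and overall mean $\mu$, the between-class variance is

\[ \sigma_t^2 = \frac{n_1^{(t)}}{n}\left(\mu_1^{(t)} - \mu\right)^2 + \frac{n_2^{(t)}}{n}\left(\mu_2^{(t)} - \mu\right)^2 , \]

and the $t$ that maximizes $\sigma_t^2$ is taken as the threshold.

Sauvola [16]: for the pixel $(x, y)$, with $m(x, y)$ and $s(x, y)$ the mean and standard deviation of the gray levels in a window of width $\omega$ around it, the local threshold is

\[ t(x, y) = m(x, y)\left(1 + k\left(\frac{s(x, y)}{128} - 1\right)\right), \quad k \in [0.2, 0.5] . \]

[17] 8 8 ( )

1. $i_1, \dots, i_s$ $i_1$ $\{i_1, i_2\}, \dots, \{i_1, i_s\}$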
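The two thresholding rules above can be sketched as follows. This is a minimal illustration, not the MathOCR code: the function names are hypothetical, class 1 is assumed to hold the gray levels not exceeding $t$, the default window width is arbitrary, and the window statistics are computed naively rather than with the integral images of [17].

```python
def otsu_threshold(hist):
    """Return the t in 0..255 that maximizes the between-class variance.

    hist: list of 256 pixel counts, one per gray level.
    Assumption: class 1 holds gray levels <= t, class 2 the rest.
    """
    n = sum(hist)
    if n == 0:
        raise ValueError("empty histogram")
    total = sum(g * c for g, c in enumerate(hist))
    mu = total / n
    best_t, best_var = 0, -1.0
    n1 = s1 = 0
    for t in range(256):
        n1 += hist[t]
        s1 += t * hist[t]
        n2 = n - n1
        if n1 == 0 or n2 == 0:
            continue  # one class is empty, variance undefined
        mu1, mu2 = s1 / n1, (total - s1) / n2
        var = n1 / n * (mu1 - mu) ** 2 + n2 / n * (mu2 - mu) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def sauvola_threshold(m, s, k=0.5):
    """t = m * (1 + k * (s / 128 - 1)) for local mean m and std s."""
    return m * (1 + k * (s / 128 - 1))


def sauvola_at(img, x, y, w=15, k=0.5):
    """Local threshold at (x, y) from a w-by-w window, clipped at borders."""
    h = w // 2
    vals = [img[j][i]
            for j in range(max(0, y - h), min(len(img), y + h + 1))
            for i in range(max(0, x - h), min(len(img[0]), x + h + 1))]
    m = sum(vals) / len(vals)
    s = (sum((v - m) ** 2 for v in vals) / len(vals)) ** 0.5
    return sauvola_threshold(m, s, k)
```

The naive window costs time proportional to the window area per pixel; integral images bring that down to a constant number of lookups, which is presumably why [17] is cited.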

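2. 1 1. 2. $m \times n$, $m = n = 3$; $i = 1, \dots, m$; $j = 1, \dots, n$; $(i, j)$: $N_{ij}$, $A_{ij}$,

\[ d_{ij} = \frac{N_{ij}}{A_{ij}} \]

( )

\[ \frac{d_{ij}}{\dfrac{1}{mn}\sum_{r=1}^{m}\sum_{s=1}^{n} d_{rs}} \]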
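One plausible reading of the quantities above is a grid density feature: the glyph bitmap is cut into $m \times n$ cells, $N_{ij}$ counts the foreground pixels in cell $(i, j)$, $A_{ij}$ is the cell area, and each density $d_{ij}$ is divided by the mean density over all cells. The surviving text does not confirm that interpretation, so the sketch below is an assumption throughout, including the name `grid_density` and the way cell boundaries are chosen.

```python
def grid_density(glyph, m=3, n=3):
    """Normalized cell densities of a binary glyph (illustrative sketch).

    glyph: list of rows, each a list of 0/1 values (1 = foreground).
    Assumption: d_ij = N_ij / A_ij, then divided by the mean of all d_rs.
    """
    rows, cols = len(glyph), len(glyph[0])
    d = []
    for i in range(m):
        r0, r1 = i * rows // m, (i + 1) * rows // m
        row = []
        for j in range(n):
            c0, c1 = j * cols // n, (j + 1) * cols // n
            area = (r1 - r0) * (c1 - c0)
            count = sum(glyph[r][c]
                        for r in range(r0, r1) for c in range(c0, c1))
            row.append(count / area if area else 0.0)
        d.append(row)
    mean = sum(map(sum, d)) / (m * n)
    # A blank glyph has mean density 0; return zeros instead of dividing.
    return [[v / mean if mean else 0.0 for v in row] for row in d]
```

Dividing by the mean density makes the feature insensitive to stroke weight, since a uniformly bolder glyph scales every cell by roughly the same factor.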

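For a region $R$, the $(i, j)$ moment is

\[ M_{ij} = \sum_{(x, y) \in R} x^i y^j , \]

and the $(i, j)$ central moment is

\[ \mu_{ij} = \sum_{(x, y) \in R} (x - \bar{x})^i (y - \bar{y})^j , \qquad \bar{x} = \frac{M_{10}}{M_{00}}, \quad \bar{y} = \frac{M_{01}}{M_{00}} . \]

The normalized $(i, j)$ central moment is

\[ \eta_{ij} = \frac{\mu_{ij}}{\mu_{00}^{\frac{i+j}{2}+1}} . \]

Hu [12]

$(a_{ij})_{m \times n}$: $\left(\sum_{k=1}^{n} a_{ik}\right)_{m \times 1}$, $\left(\sum_{k=1}^{m} a_{kj}\right)_{1 \times n}$

3. Hausdorff [1] 2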
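The moment definitions and the row and column sums translate directly into code. The sketch below is illustrative only; the function names are hypothetical, and a region is assumed to be given as a list of $(x, y)$ foreground coordinates.

```python
def raw_moment(region, i, j):
    """M_ij: sum of x^i * y^j over the (x, y) points of the region."""
    return sum(x ** i * y ** j for x, y in region)


def central_moment(region, i, j):
    """mu_ij: moment taken about the centroid (x_bar, y_bar)."""
    m00 = raw_moment(region, 0, 0)
    x_bar = raw_moment(region, 1, 0) / m00
    y_bar = raw_moment(region, 0, 1) / m00
    return sum((x - x_bar) ** i * (y - y_bar) ** j for x, y in region)


def normalized_moment(region, i, j):
    """eta_ij = mu_ij / mu_00^((i + j) / 2 + 1)."""
    return central_moment(region, i, j) / (
        central_moment(region, 0, 0) ** ((i + j) / 2 + 1))


def projections(a):
    """Row sums (m x 1) and column sums (1 x n) of an m x n matrix."""
    return [sum(row) for row in a], [sum(col) for col in zip(*a)]
```

Central moments do not change when the region is translated, and the normalization by a power of $\mu_{00}$ removes the dependence on scale, which is what makes them usable as glyph features.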

90% 4 5 5 4 70% 1 ( ) [ ] { } {}}{ }{{} Hausdorff.... LaTeX TeX [18] 1 $\pi/4 \approx 0.7854$
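The Hausdorff distance used for template verification can be sketched as below. This shows the textbook symmetric definition over two point sets with Euclidean distance; whether MathOCR uses this form or a modified variant is not recoverable from the surviving text, and the function names are hypothetical.

```python
from math import hypot


def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point in b."""
    return max(min(hypot(ax - bx, ay - by) for bx, by in b)
               for ax, ay in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    glyph = [(0, 0), (1, 0)]
    template = [(0, 0), (4, 0)]
    print(hausdorff(glyph, template))
```

A small distance means every foreground pixel of the glyph lies near some pixel of the template and vice versa, so a candidate can be accepted when the distance falls below a chosen tolerance.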

LaTeX MathML 1 1

PostScript PDF MathOCR[19] Java [20] GNU GPL 3

AMSFonts[21] CMB10 CMBSY10 CMEX10 CMMI10 CMMIB10 CMR10 CMSY10 MSAM10 MSBM10 RSFS10 Java TrueType FontForge[22] TTF LaTeX MathOCR 40 Sauvola 1 40 MathOCR 2 1 1

Table 1 (caption text lost)

Font       40          30          20          10
CMB10      123/125     106/125     97/125      34/125
CMBSY10    126/126     120/126     104/126     55/126
CMEX10     95/95       82/95       77/95       52/95
CMMI10     110/110     98/110      77/110      3/110
CMMIB10    111/111     99/111      69/111      25/111
CMR10      126/128     99/128      62/128      5/128
CMSY10     125/126     115/126     106/126     29/126
MSAM10     114/114     105/114     93/114      24/114
MSBM10     85/85       80/85       68/85       3/85
RSFS10     26/26       25/26       18/26       0/26
Total      1041/1046   929/1046    771/1046    230/1046

Table 2 (caption, column headers and row labels lost)

39     30     76.9%
13     11     84.6%
23     19     82.6%
22     15     68.2%
24     16     66.7%
20     14     70.0%
35     30     85.7%
147    113    76.9%

MathOCR 1 2 1: MathOCR 2: MathOCR

[1],. [M]. :, 2010.
[2] Science Accessibility Net. InftyReader-Top Page[EB/OL]. [2014-08-17]. http://www.sciaccess.net/en/inftyreader/.
[3]. ( )[EB/OL]. [2014-08-17]. http://www.saqtech.com.cn/saq_document01.asp.
[4],,. [C]//. : 2002:31-37.
[5],. [C]//2001. : 2001:69-74.
[6] Chan K F, Yeung D Y. Mathematical expression recognition: a survey[R]. Hong Kong: HKUST, 1999.
[7] Trier Ø D, Taxt T, Jain A K. Feature extraction methods for character recognition - A survey[J]. Pattern Recognition, 1996, 29(4):641-662.
[8] Malon C, Suzuki M, Uchida S. Support Vector Machines for Mathematical Symbol Recognition[C]// Structural, Syntactic, and Statistical Pattern Recognition, Proceedings. Berlin: Springer-Verlag, c2006: 136-144.
[9] Raja A, Rayner M, Sexton A, Sorge V. Towards a parser for mathematical formula recognition[C]// Mathematical Knowledge Management, Proceedings. Berlin: Springer-Verlag, c2006: 139-151.
[10] Eto Y, Suzuki M. Mathematical formula recognition using virtual link network[C]// Proceedings of Sixth International Conference on Document Analysis & Recognition. Washington: IEEE Computer Society, c2001: 762-767.
[11],,,. [J]., 2008, 44(16):18-26.
[12] Burger W, Burge M J ;. : Java [M]. :, 2010.
[13]. [D]. :, 2009.
[14] O'Gorman L. Image and Document Processing Techniques for the RightPages Electronic Library System[C]// Proceedings of the International Conference on Pattern Recognition. Los Alamitos: IEEE, c1992: 260-263.

[15] Shapiro L G, Stockman G C ;,,. [M]. :, 2005.
[16] Sauvola J, Pietikäinen M. Adaptive document image binarization[J]. Pattern Recognition, 2000, 33(2):225-236.
[17] Shafait F, Keysers D, Breuel T M. Efficient Implementation of Local Adaptive Thresholding Techniques Using Integral Images[C]// Proceedings of The International Society for Optical Engineering, Document Recognition and Retrieval XV. San Jose: SPIE-IS&T, 2008:681510-681510-6.
[18] Knuth D E. The TeXbook[M]. Reading: Addison-Wesley, 1986.
[19]. MathOCR[CP/DK]. [2014-11-29]. http://mathocr.sourceforge.net/.
[20] Open JDK 1.7.0_55[CP/DK]. [2014-05-16]. http://openjdk.java.net/.
[21] American Mathematical Society. AMSFonts 3.04[CP/DK]. [2013-01-14]. http://www.ctan.org/pkg/amsfonts.
[22] Williams G. FontForge[CP/DK]. [2012-07-31]. http://fontforge.org/.

An Attempt on Printed Mathematical Formula Recognition

Abstract: Optical formula recognition should be an essential part of a document analysis system, yet most mainstream systems lack it. This article proposes a practical solution to formula recognition. Like most existing designs, the system consists of two main parts: symbol recognition and structural analysis. The core of the symbol recognition part is a glyph recognizer: coarse classification followed by fine classification produces candidates, which are then verified by template matching based on the Hausdorff distance. Dynamically generated templates are used to match special glyphs, and empirical rules are used to match lines and dots. Glyphs are then combined into symbols according to their recognition results and coordinates. The structural analysis part takes a bottom-up approach. Scripts, fractions, radical expressions, matrices and multi-line expressions are supported, and further extension is possible. An implementation based on the ideas presented in this article, MathOCR, is already available as free software. Although the system is not yet robust enough for daily industrial use, it can produce impressive output from high-quality input.

Key Words: optical mathematical formula recognition; structural analysis; optical character recognition