nakjun/python-data-parser

Multi-Function File Parser Application 🚀

This project is a Streamlit-based web application that processes and converts a range of file formats. Its features include PPT-to-PDF conversion, image analysis, and OCR.

Key Features 🌟

  • PPT to PDF: converts PowerPoint presentations to PDF.
  • PDF to image: converts each page of a PDF file into a separate image.
  • Image analysis: analyzes and describes images using an advanced Vision-Language model.
  • OCR (optical character recognition): extracts text from PDFs or images.
  • TXT to PDF: converts text files to PDF.
  • PDF to HTML: converts PDF files to HTML.
  • Extract images from PDF: extracts the images embedded in a PDF file and runs OCR on them.
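
As an illustration of how the features above relate to input file types, the following minimal Python sketch maps each feature to the extensions it might accept. The `FEATURES` table, the extension sets, and the `features_for` helper are assumptions made for this example; they are not taken from the repository's code.

```python
from pathlib import Path

# Hypothetical table: the feature names follow the list above,
# but the accepted extensions are assumptions for illustration only.
FEATURES = {
    "PPT to PDF": {".ppt", ".pptx"},
    "PDF to image": {".pdf"},
    "Image analysis": {".png", ".jpg", ".jpeg"},
    "OCR": {".pdf", ".png", ".jpg", ".jpeg"},
    "TXT to PDF": {".txt"},
    "PDF to HTML": {".pdf"},
    "Extract images from PDF": {".pdf"},
}


def features_for(filename: str) -> list[str]:
    """Return the features whose accepted extensions include this file's suffix."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match
    return [name for name, exts in FEATURES.items() if suffix in exts]
```

A lookup like this could be used to offer only the features that apply to an uploaded file, for example `features_for("notes.txt")` yields only the TXT-to-PDF entry.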

Installation 📦

  1. Clone the repository:

    git clone https://github.com/nakjun/python-data-parser.git
    
  2. Install the required libraries:

    pip install -r requirements.txt
    
  3. Run the application:

    streamlit run main.py
    streamlit run main.py --server.port 9999 # change to any port you prefer
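
The two launch commands above differ only in the optional `--server.port` argument. A minimal sketch of assembling that command from Python follows; the `build_command` helper is hypothetical and not part of this project.

```python
def build_command(script="main.py", port=None):
    """Return the argument list for launching the app with `streamlit run`."""
    cmd = ["streamlit", "run", script]
    if port is not None:
        # Reject values outside the valid TCP port range before launching.
        if not 1 <= port <= 65535:
            raise ValueError(f"port must be between 1 and 65535, got {port}")
        cmd += ["--server.port", str(port)]
    return cmd
```

Passing the result to `subprocess.run(build_command(port=9999), check=True)` would start the app on port 9999, assuming the `streamlit` executable is on the PATH.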
    

Usage 🖥️

  1. Open the application in a web browser.
  2. Select the feature you want from the sidebar.
  3. Follow the prompts to upload and process your file.
  4. Review the result and download it if needed.
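
The steps above (pick a feature, upload, process, download) can be sketched in Streamlit roughly as follows. This is an assumption-laden outline, not the project's actual `main.py`: `HANDLERS`, `echo_handler`, `run_feature`, and `render` are hypothetical names, and the placeholder handler simply returns its input where a real converter would go.

```python
def echo_handler(data: bytes) -> bytes:
    """Placeholder standing in for a real converter; returns the input unchanged."""
    return data


# Hypothetical registry: feature name -> function taking and returning bytes.
HANDLERS = {"PDF to HTML": echo_handler, "OCR": echo_handler}


def run_feature(feature: str, data: bytes) -> bytes:
    """Dispatch the uploaded bytes to the handler registered for the feature."""
    try:
        handler = HANDLERS[feature]
    except KeyError:
        raise ValueError(f"Unknown feature: {feature}") from None
    return handler(data)


def render() -> None:
    """Draw the sidebar selector, the uploader, and the download button."""
    # Imported lazily so the helpers above work without Streamlit installed.
    import streamlit as st

    feature = st.sidebar.selectbox("Feature", list(HANDLERS))
    uploaded = st.file_uploader("Upload a file")
    if uploaded is not None:
        result = run_feature(feature, uploaded.getvalue())
        st.download_button(
            "Download result", data=result, file_name=f"result_{uploaded.name}"
        )
```

Calling `render()` at the bottom of a script launched with `streamlit run` would draw this interface; keeping the dispatch in `run_feature` lets the processing logic be exercised without a browser.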

Contributing 🤝

Want to contribute to the project? Great! Follow these steps:

  1. Fork this repository.
  2. Create a new branch (git checkout -b feature/Features).
  3. Commit your changes (git commit -m 'Add some Features').
  4. Push to the branch (git push origin feature/Features).
  5. Open a Pull Request.

Contact 📧

Project maintainer - ✉️ njsung1217@gmail.com


⭐️ If this project helped you, please give it a star!
