AI Technology Classification and Evaluation Metrics

xdots 2023. 10. 7. 13:15

Mass production of intellectual labor, made possible by AI

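1. AI classification (2006)

This is the AI classification I drew up about 20 years ago. The field of AI can be divided into broad areas such as theoretical, reasoning, and sensory AI. At the time, the expert systems that had been popular earlier were failing to deliver the expected performance.

(Theoretical AI) Includes techniques and inference methods for handling the uncertainty present in knowledge and information, planning methods for achieving goals, and machine learning methods.
(Reasoning AI) Includes fuzzy theory (the logic of vagueness), expert systems (knowledge-based systems that draw on expert knowledge), and neural networks.
(Sensory AI) Includes vision problems for recognition and the methods needed for natural language processing.
A trend about 20 years ago was work on the concept and use of agents, which are essential for building new software suited to distributed environments.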

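Source: "Design and Application of a Warfighting Experimentation Process for Software-Intensive Systems Using an Intelligence Maturity Model," Journal of the Korean Institute of Intelligent Systems, 2007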

2. AI classification and evaluation metrics (2023)

Computer Vision

Depth Estimation Model : N/A
Image classification : Accuracy, Recall, Precision, F1 score
Image Segmentation : Average Precision (AP), Mean Average Precision (mAP), Mean Intersection over Union (mIoU), APα (e.g. AP50, AP75)
Image-to-image : Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), Inception Score (IS)
Object Detection : Average Precision (AP), Mean Average Precision (mAP), APα (e.g. AP50, AP75)
Video classification : Accuracy, Recall, Precision, F1 score
Unconditional image generation : Inception Score (IS), Fréchet Inception Distance (FID)
Zero-shot image classification : top-K accuracy
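
The IoU and APα entries in the list above can be made concrete with a small sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2); the function names `box_iou` and `is_true_positive` are hypothetical, chosen only for illustration. Under APα, a detection is usually counted as correct when its IoU with a ground-truth box is at least α (0.5 for AP50, 0.75 for AP75).

```python
def box_iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, alpha=0.5):
    """Hypothetical helper: does the prediction match at IoU threshold alpha?"""
    return box_iou(pred_box, gt_box) >= alpha


pred = (0, 0, 10, 10)
gt = (5, 0, 15, 10)
print(box_iou(pred, gt))                 # 50 / 150, about 0.333
print(is_true_positive(pred, gt, 0.5))   # False: below the AP50 threshold
```

The same prediction can therefore count as a hit under a loose threshold and a miss under a strict one, which is why AP50 and AP75 are reported separately.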

Natural Language Processing

Conversational response modelling : BLEU (Bilingual Evaluation Understudy) score
Masked language modeling : Cross Entropy, Perplexity
Question Answering : Exact Match, F1 score
Sentence Similarity : Reciprocal Rank, Cosine Similarity
Summarization : ROUGE-N
Table Question Answering (Table QA) : Denotation Accuracy
Text Classification : Accuracy, Recall, Precision, F1 score
Text Generation : Cross Entropy, Perplexity
Token classification : Accuracy, Recall, Precision, F1 score
Translation : BLEU, SacreBLEU
Zero-shot text classification
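
As a rough illustration of three metrics from the list above, the sketch below computes Exact Match and token-level F1 for question answering, and perplexity as the exponential of the average cross entropy. The normalization step (lowercasing and whitespace splitting) is a simplifying assumption, benchmark scripts typically normalize more aggressively, and all function names are hypothetical.

```python
import math
from collections import Counter


def normalize(text):
    # Simplified normalization (assumption): lowercase, split on whitespace.
    return text.lower().split()


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def perplexity(token_probs):
    """exp of the mean negative log-probability assigned to the true tokens."""
    cross_entropy = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(cross_entropy)


print(exact_match("Seoul", "seoul"))           # 1.0
print(token_f1("the city of Seoul", "Seoul"))  # precision 0.25, recall 1.0
print(perplexity([0.25, 0.25, 0.25, 0.25]))    # about 4: like a uniform 4-way guess
```

Exact Match rewards only a perfect answer, while F1 gives partial credit for overlapping tokens, so the two are usually reported together.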

Audio

Audio classification : Accuracy, Recall, Precision, F1 score
Audio-to-Audio : Signal-to-Noise Ratio (SNRI), Signal-to-Distortion Ratio (SDRI)
Automatic Speech Recognition (ASR) : Word error rate (WER), Character error rate (CER)
Text-to-Speech (TTS) : Mel Cepstral Distortion (MCD)

Tabular

Tabular classification : Accuracy, Recall, Precision, F1 score
Tabular regression : Mean Squared Error (MSE), Coefficient of determination (R-squared)
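
The word error rate (WER) and character error rate (CER) listed for automatic speech recognition are both edit distances divided by the reference length. The following is a minimal sketch, assuming whitespace tokenization for words and a non-empty reference; the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word-level edits divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character-level edits divided by the number of reference characters."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words
print(cer("abcd", "abxd"))                                  # 1 substitution / 4 chars
```

Because insertions are counted, both rates can exceed 1.0 when the hypothesis is much longer than the reference.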

Multimodal

Document Question Answering : Average Normalized Levenshtein Similarity(ANLS), Exact Match
Feature extraction : N/A
Image to Text : N/A
Text to Image : Inception Score (IS), Fréchet Inception Distance (FID), R-precision
Text-to-video : Inception Score (IS), Fréchet Inception Distance (FID), Fréchet Video Distance (FVD), CLIPSIM
Visual Question Answering : Accuracy, Wu-Palmer similarity

Reinforcement Learning

Discounted Total Reward, Mean Reward, Level of Performance After Some Time
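
The first two reinforcement learning measures above can be sketched as follows. The discount factor `gamma` and the reading of "mean reward" as the average undiscounted episode total are assumptions made for illustration; the function names are hypothetical.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards where the reward at step t is weighted by gamma**t."""
    total = 0.0
    # Work backwards so each step folds in the discounted future return.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def mean_reward(episodes):
    """Average undiscounted total reward across evaluation episodes."""
    return sum(sum(ep) for ep in episodes) / len(episodes)


print(discounted_return([1, 1, 1], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
print(mean_reward([[1, 2], [3]]))               # (3 + 3) / 2 = 3.0
```

A smaller `gamma` makes the agent's score depend more on early rewards, so results are only comparable when the same discount factor is used.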

AI Index

Stanford University's AI Index Report: AI Index Report 2023 – Artificial Intelligence Index. The AI Index is an independent initiative at the Stanford Institute for Human-Centered Artificial Intelligence (HAI), led by the AI Index Steering Committee, an i…

dase.tistory.com

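※ CodeFusion metrics

CODEFUSION

This appears to be a paper that could end up being used for code generation in Microsoft's Copilot. Reportedly, a table in the version first posted to arXiv listed ChatGPT as having 20B parameters, and the figure was later removed. Currently, on arXiv…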

dase.tistory.com

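※ Classification metrics

Classification metrics: confusion matrix

◦ Accuracy: the proportion of correct predictions out of all predictions; a metric that can be used in general.
◦ Precision: of the cases classified as positive, the number that are actually positive…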

dase.tistory.com
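
The accuracy and precision definitions in the excerpt above, together with the recall and F1 score that recur throughout the task lists, all follow from the four confusion-matrix counts. This is a minimal sketch for the binary case, assuming label 1 is the positive class; the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct among predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # found among actual positives
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(confusion_counts(y_true, y_pred))        # (2, 1, 1, 4)
print(classification_metrics(y_true, y_pred))  # accuracy 0.75, others 2/3
```

Accuracy looks strong here partly because negatives dominate, which is why precision and recall are reported alongside it on imbalanced data.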

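※ Regression metrics

Regression metrics

◦ (Evaluation metrics) Model validation uses reference metrics to check numerically how far predicted values are from actual values. MAE (Mean Absolute Error) is the mean of the absolute errors, which allows an intuitive evaluation, but its sensitivity to the magnitude of the error…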

dase.tistory.com
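
To make the comparison between actual and predicted values concrete, the sketch below computes MAE next to MSE, RMSE, and the coefficient of determination. It assumes plain Python lists and a target that is not constant (otherwise R-squared is undefined); the function name is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot              # assumes ss_tot > 0
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Three perfect predictions and one that is off by 4.
print(regression_metrics([1, 2, 3, 4], [1, 2, 3, 8]))
# mae 1.0, mse 4.0, rmse 2.0, r2 -2.2
```

In this example a single large miss raises MAE to 1.0 but MSE to 4.0, because squaring weights large errors more heavily than small ones.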

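※ Object detection

Performance indicators for image object detection

◦ (Model performance metrics) In object detection research, a commonly used performance indicator is a measure of accuracy based on the proportion of the model's detections that match the ground truth: mAP (mean Average Pre…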

dase.tistory.com
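
One way to see how a per-class AP and the resulting mAP are obtained is the sketch below. It is not the procedure of any particular benchmark: it assumes each detection has already been marked as matching a ground-truth object or not, and it uses the un-interpolated area under the precision-recall curve, whereas benchmark toolkits often use interpolated variants. The function names are hypothetical.

```python
def average_precision(scores, is_match, num_ground_truth):
    """Area under the precision-recall curve for one class.

    scores: confidence of each detection
    is_match: True if that detection matched a ground-truth object
    num_ground_truth: number of ground-truth objects of this class
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    ap = 0.0
    prev_recall = 0.0
    for rank, i in enumerate(order, start=1):
        if is_match[i]:
            tp += 1
        precision = tp / rank
        recall = tp / num_ground_truth
        # Precision only contributes at ranks where recall increases.
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class_ap):
    """mAP: the mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


ap_a = average_precision([0.9, 0.8, 0.7], [True, False, True], num_ground_truth=2)
print(ap_a)                                 # 0.5 * 1.0 + 0.5 * (2 / 3), about 0.833
print(mean_average_precision([ap_a, 0.5]))  # about 0.667
```

Ranking matters here: if the false positive had the lowest confidence instead of the second highest, the AP for the same three detections would be 1.0.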

License: Attribution, NonCommercial, NoDerivatives