zero-shot

<맨 위로>

zero-shot이 뭘까?
어떻게 unssen을 classification 할 수 있는걸까?
references

zero-shot이 뭘까?

위키에서는 다음과 같이 정의되어있다.(https://en.wikipedia.org/wiki/Zero-shot_learning)

Zero-shot learning (ZSL) is a problem setup in machine learning, where at test time, a learner observes samples from classes, which were not observed during training, and needs in order to predict the class they belong to

보통 training example 수가 적을 때 사용하는 방법이라고한다.
zero-shot의 가장 큰 점은 train set에 없는 unseen class를 예측한다는 점이다.
일반적으로는 unseen은 unseen이다. 가르친적이 없기때문에. 그러나 zero-shot은 이를 구체적인 class prediction을 한다는 것이다! 주변에서 쉽게 찾아볼 수있는건 Google 번역이 있다. 16개의 언어를 시작으로 다른 언어 간 번역을 기반으로 시스템에게 학습되지않은 타 언어 간 번역을 진행할 수 있다고한다.

어떻게 unssen을 classification 할 수 있는걸까?

여러가지 접근 방법이 존재한다(논문들도 다양하고..). 이 포스트 에 의하면 다음과 같은 유망한 접근 기술이 존재한다고 한다.
아직 너무어렵다.. 한국말인데 뭐라는지 모르겠다..ㅠ

의미적으로 인접한 seen data의 통계 지식을 바탕으로 classification model 구축
class에 대한 설명에서 object의 semantic infomation을 추출해서 사용
class에 대한 설명이 있는 경우, word vector 등을 이용해 semantic information을 추출하여 source class로 부터 target class를 추론

개인적으로 이 논문: A review of Generalized Zero-Shot Learning Methods 이 조금 간단하고 명확하게 설명하고 있는 것같다.
위 논문에서는 generative-based methods와 embedding based method 2가지를 말하고 있다.

그림과 함께 이해하는 것이 빠르다.

2022-03-23-zero-shot-img-1
generative-based method 같은 경우는 generative model로 생성한 unseen data를 training set에 포함시켜 train 하여 unseen prediction도 가능케한다.

2022-03-23-zero-shot-img-2
embedding-based method는 동일 class에 대한 visual feature를 semantic embedding space로 가깝게 맵핑한다.
해당 space에 unseen visual feature를 projection 했을 때 semantic information을 unseen class로 구분한다.

위 두 방법 모두 vectorization을 우선 하고 이를 seen data와 결합하여 진행하는 식으로 했다.~~이미지 자체를 학습할 순없으니까 당연한건가 2 멍청2!~~

… 처음에는 굉장히 좋은 방법이라 생각하고 탐구했지만,,, 이 방법을 조금 알고보니까 내가 진행하기에는 심하게 해비한 느낌이다. 그래도 엄첨 좋은 알고리즘인 것 같다 :)

references

https://m.blog.naver.com/with_msip/221886769247
https://deep-learning-study.tistory.com/873
논문스터디 Week 8-9 Zero-shot Learning Through Cross-Modal Transfer