Metrics

Introduction

As more business problems can be solved using large language models (LLMs) in chat or Q&A systems, the question of how to evaluate these systems has become increasingly important. Without proper evaluation, it is difficult to know whether a system is providing real value to the business and its users, or misleading them and potentially causing harm. ...