Adds a promptfoo eval harness for agent quality scoring. An LLM-as-judge system scores task completion, instruction adherence, identity consistency, deliverable quality, and safety. Includes tests.
7 lines
63 B
Plaintext
node_modules/
dist/
.promptfoo/
results/latest.json
*.log
.env