I would like to investigate or discuss why Claude produces more green cases.
These results look like outliers, or possibly some out-of-band detection that the prompts are bogus.
- Is there detection at the API level, i.e. do they catch on to your API keys sooner?
- Could we route the 100 API keys through 100 residential proxies?
Is there phrasing, or are there patterns, in the questions that the models have been specifically trained on? Meaning: Anthropic developers could have taken the first set of questions, added a few of their own, and trained the models to give refusal or pushback responses to those patterns.
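One way to probe this hypothesis is to compare refusal rates between the canonical question wording and semantically equivalent paraphrases: if the exact phrasing draws far more pushback than the paraphrases, that is weak evidence the pattern itself was trained against. Below is a minimal sketch; `query_model` is a placeholder for whatever API client is in use, and the refusal markers and prompts are illustrative assumptions, not a validated classifier.

```python
# Sketch: does refusal track exact phrasing, or the underlying intent?
# Assumptions: `query_model(prompt) -> str` is supplied by the caller;
# REFUSAL_MARKERS is a crude, hand-picked heuristic list.

REFUSAL_MARKERS = ("i can't help", "i cannot assist", "i won't")


def looks_like_refusal(response: str) -> bool:
    """Crude heuristic: does the reply open with a known pushback phrase?"""
    head = response.lower()[:200]
    return any(marker in head for marker in REFUSAL_MARKERS)


def refusal_rate(query_model, prompts):
    """Fraction of prompts that draw a refusal-style reply."""
    hits = sum(looks_like_refusal(query_model(p)) for p in prompts)
    return hits / len(prompts)


# Usage sketch: same intent, different surface forms.
canonical = ["<original benchmark question>"]
paraphrases = [
    "<same question, reworded>",
    "<same question, different framing>",
]
# rate_canonical = refusal_rate(query_model, canonical)
# rate_paraphrase = refusal_rate(query_model, paraphrases)
# A large gap between the two rates would point at pattern-specific training.
```

A real version would want a stronger refusal classifier than substring matching, and enough paraphrases per question to make the rate comparison meaningful.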
Are there any other potential reasons for this success?