That will produce challenging boards?
If your base solver can be run in various configurations, with different levels of reasoning and assumptions, and it reports how much search (if any) was needed, that can be a very useful way to measure hardness. In Sudoku as a Constraint Problem (https://citeseerx.ist.psu.edu/document?doi=4f069d85116ab6b4c...), Helmut Simonis tested many 9x9 Sudoku puzzles against various levels of propagation and pre-processing, categorizing them by the level of reasoning needed to solve them without search. The MiniZinc model for LinkedIn Queens (https://news.ycombinator.com/item?id=44353731) can be used with various solvers and levels of propagation as such a subroutine.
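To make the idea concrete, here is a minimal Python sketch of grading a puzzle by the weakest level of reasoning that solves it without search. The two levels used here (naked singles, then hidden singles) and the names `propagate` and `grade` are my own illustrative assumptions; they are not the propagation levels from the Simonis paper, and this is not the MiniZinc model.

```python
from itertools import product

CELLS = list(product(range(9), repeat=2))
DIGITS = "123456789"

def peers_of(r, c):
    """Cells sharing a row, column, or 3x3 box with (r, c)."""
    box = {(r // 3 * 3 + i, c // 3 * 3 + j) for i in range(3) for j in range(3)}
    line = {(r, k) for k in range(9)} | {(k, c) for k in range(9)}
    return (box | line) - {(r, c)}

PEERS = {cell: peers_of(*cell) for cell in CELLS}
UNITS = ([[(r, c) for c in range(9)] for r in range(9)] +
         [[(r, c) for r in range(9)] for c in range(9)] +
         [[(br + i, bc + j) for i in range(3) for j in range(3)]
          for br in (0, 3, 6) for bc in (0, 3, 6)])

def propagate(cands, level):
    """Apply inference rules up to `level` until nothing changes.

    Returns False if a contradiction is found, True otherwise.
    """
    changed = True
    while changed:
        changed = False
        # Level 1: a decided cell removes its digit from all of its peers.
        for cell in CELLS:
            if len(cands[cell]) == 1:
                d = next(iter(cands[cell]))
                for p in PEERS[cell]:
                    if d in cands[p]:
                        cands[p] = cands[p] - {d}
                        if not cands[p]:
                            return False
                        changed = True
        # Level 2: a digit with only one possible place in a unit goes there.
        if level >= 2:
            for unit in UNITS:
                for d in DIGITS:
                    spots = [c for c in unit if d in cands[c]]
                    if not spots:
                        return False
                    if len(spots) == 1 and len(cands[spots[0]]) > 1:
                        cands[spots[0]] = {d}
                        changed = True
    return True

def grade(puzzle):
    """Grade an 81-character puzzle string ('.' or '0' marks an empty cell).

    Returns the weakest level that solves the puzzle with no search at all.
    Note: a contradictory puzzle also falls through to 'needs-search' here.
    """
    for level, name in ((1, "naked-singles"), (2, "hidden-singles")):
        cands = {cell: set(DIGITS) if ch in ".0" else {ch}
                 for cell, ch in zip(CELLS, puzzle)}
        if propagate(cands, level) and all(len(v) == 1 for v in cands.values()):
            return name
    return "needs-search"
```

A generator could then keep only the boards whose grade matches the difficulty it is aiming for. With a real constraint solver, the same loop would swap in stronger propagation settings and use the reported search effort as a further measure.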
Now, for production-level puzzle making, such as what King does for Candy Crush, the problems and requirements are even harder. I've heard presentations where they talk about training neural networks to play like human testers (not optimal play, but the most human-like play) in order to test the hardness level of the puzzles.