AoS Warhammer Weekly NPE Discussion

Discussion in 'Seraphon Discussion' started by Carnikang, Feb 11, 2021.

  1. LordBaconBane
    Ripperdactil

    LordBaconBane Well-Known Member

    Messages:
    475
    Likes Received:
    1,242
    Trophy Points:
    93
    Interesting you mention this, I was just mathing this out. Saurus guard have:
    • Higher effective wounds (20 wounds vs 11.62)
    • Gives the benefit of Look Out Sir! where Eternity Warden doesn't
    • More bodies for objectives
    • More damage
    • Upgrades that the Eternity Warden doesn't have
    • The exact same Selfless Protector ability
    But the Eternity Warden has higher point efficiency? Maybe if you compare it only to other heroes. It only has this because of more keywords and a CA according to what they present.

    I might look at some other factions I'm familiar with, like Skaven, and see if it holds up to similar scrutiny.
     
    Carnikang likes this.
  2. Putzfrau
    Skar-Veteran

    Putzfrau Well-Known Member

    Messages:
    2,291
    Likes Received:
    2,914
    Trophy Points:
    113
    Eh, still feel like you're being unnecessarily harsh. An efficient bad unit is still bad... but it is also efficient. So if you're judging everything just on efficiency of raw stats, it's an adequate tool.

    Again, it gives tangibility to an argument that otherwise has none. Considering I don't see any kind of factual evidence, real life experience, or data of literally any kind in any of your arguments, it's "fine" for the purposes that we need it for.

    It doesn't compare eternity wardens to saurus guard. It compares them to other similarly statted/priced hero models. It's saying "for XX points, you're getting XX% of 'stats' compared to other similar units." If you're going to so heavily criticize something you should at least be judging it accurately.

    Also, dude gets plenty of advice and shares everything on his twitter and is super open about everything. How about if you want to contribute you do the due diligence to help him out? He already went through the effort of building the whole thing, not sure why he should also seek out some random stranger's advice on how to make it better.
     
    Last edited: Mar 10, 2021
    LordBaconBane likes this.
  3. Canas
    Slann

    Canas Ninth Spawning

    Messages:
    7,040
    Likes Received:
    10,684
    Trophy Points:
    113
    In general heroes seem to do well according to this tool. There are several factions like SCE & FEC which apparently only have efficient heroes and basically everything else is not efficient (or well, SCE has 2 or 3 regular "efficient" units vs. like 8 "efficient" heroes, everything else is inefficient).

    The tool basically seems to be biased towards heroes. Which isn't surprising as heroes generally have more abilities, multiple weapons & above average save, number of spellcasts & bravery for their point cost. All of which are important attributes according to this tool. Not to mention the potential effect model count has (again, super biased against heroes & behemoths given those are all relatively expensive and 1 model), though I'd need to see the actual trees to judge how messy it's made things.

    It's a model built on bad data, and if he knows what gradient boosted trees are & how to use those then he should also know how important it is to properly curate your data. So no I'm not.

    Eh, I mean sure? It's also meaningless though, cuz it's still bad. What's important to realize is that it is entirely possible for a unit to be efficient because it "wastes" that efficiency on buying a meaningless stat. The best example of this is probably bravery. A hero, behemoth or MSU of 3-man elite type units having high bravery is basically pointless in most cases cuz they're not going to be subject to battleshock anyway. So yeah, if you have two otherwise identical heroes, one with a bravery of 10 and one with a bravery of 1, the one with 10 is vastly more efficient. But in practice the actual difference is negligible since bravery doesn't really matter in most situations...

    It's not fine, it's a bad model cuz it's based on bad data, making it utterly useless.

    Yeah no. Nowhere does it say he built a model for different types of units. Based on what it says there, the trees are built on the entire dataset of units, not on subsets according to unit type.

    So yeah, that eternity warden is compared directly to saurus guard.

    Either that, or the description on the site is just wrong. In which case presenting all results in one table is also very bad cuz they're not results from the same model....

    He seems open enough, but there's little to no actual discussion about his data though as far as I can see on his twitter. There's mostly discussion about certain units being better/worse because his model doesn't take into account allegiance abilities, not because what he does take into account is already significantly flawed.

    It doesn't need to be me giving the advice, there are plenty of other data scientists/statisticians/mathematicians/etc. that can point out the flaws in his data as well. It'd probably be even more obvious if we could see the actual trees, as those are relatively human-readable.

    As for seeking him out. Don't particularly feel like it as it's a fairly pointless endeavour to build a ML model for this without a good way to represent the effect of the various special utility abilities. And given that the utility abilities vary wildly in scope I don't think you can make a particularly decent representation of those. You're probably better off leaving quite a lot of them out. But since some of these abilities are rather impactful that'd result in a model that's by definition going to be significantly flawed.
     
  4. Putzfrau
    Skar-Veteran

    Putzfrau Well-Known Member

    Messages:
    2,291
    Likes Received:
    2,914
    Trophy Points:
    113
    edit: feels like a lot of criticism that's largely unsubstantiated.
     
    Last edited: Mar 23, 2021
  5. Canas
    Slann

    Canas Ninth Spawning

    Messages:
    7,040
    Likes Received:
    10,684
    Trophy Points:
    113
    Dude I've consistently given you explanations and examples whenever I make a claim. The fact that I don't put it in a pretty table with colours doesn't make it unsubstantiated, nor does the fact that this guy builds a model on a bunch of nonsense and puts the results in a table make his claims substantiated. I expected more from you....
     
  6. LordBaconBane
    Ripperdactil

    LordBaconBane Well-Known Member

    Messages:
    475
    Likes Received:
    1,242
    Trophy Points:
    93
    - scrub this comment I mucked up.
     
  7. LordBaconBane
    Ripperdactil

    LordBaconBane Well-Known Member

    Messages:
    475
    Likes Received:
    1,242
    Trophy Points:
    93
    Off the top of my head, if I wanted to get efficiency from abilities, you would have to try and find similar abilities from all factions and see which of those models are taken the most.

    The tricky part would be having your ML understanding what exactly an ability is doing so it can know if two abilities are similar. GW doesn't have a consistent naming convention for abilities. (There are some, like terror always seems to be -1 bravery in a range). If you did try and parse abilities, how can it tell +1 to hit and +1 to wound apart, how can it look at a situation and say, "+1 to hit for <faction name> is better than +1 to hit for <type of unit in said faction> in said range". Maybe if you had a hierarchy of keywords, used a language processor to read the sentences, and try to determine what is the most efficient based off of the value from other units?

    I don't think that would work, but if it did it runs the risk of being extremely fragile.
     
  8. Putzfrau
    Skar-Veteran

    Putzfrau Well-Known Member

    Messages:
    2,291
    Likes Received:
    2,914
    Trophy Points:
    113
    Except that's not exactly what's going on here, is it? I never said it was the end all be all of all warhammer data. I tried to specifically format my argument to not say that. I just think it's wildly unfair to call it "practically worthless" and a "bunch of nonsense in a table." That's just not an accurate description, nor do I think it's fair to all the work put in. I stand by my statement that you're being unnecessarily harsh when it was presented as only "here's a way to compare raw stats between models." Especially considering your saurus guard/eternity warden judgement isn't even accurate as explained below.

    If you feel like your argument that our warscrolls are trash is supported by better data then I'd love to see it.

    And literally directly from the page that explains how it works:

    "Listbot breaks a warscroll down into a long list of features, such as basic stats, number of special abilities, number of keywords, the presence of certain specific keywords like Monster or Hero, and many more."

    and

    "When evaluating Sykfires, Listbot knows that Tzaangor Enlightened cost 170 points. It then sees that Skyfires are very similar, except they also have a shooting attack*, which it believes increase their value, and bumps their cost up to the 180 points we discussed above."

    So, it does look for keywords and then rates things compared to "very similar" warscrolls that it knows the points and information for already. AKA it wouldn't be comparing saurus guard to the eternity warden. It would be comparing the eternity warden to "a warscroll that's very similar."

    edit: TLDR your comments come across pretty hyperbolic, to the point that they almost become meaningless. For example, the data can be flawed but still useful in some situations. Seraphon can be nerfed without ending up garbage. We can be buff dependent without being "useless" without them. It becomes impossible to have any kind of meaningful conversation if you refuse to budge from these extreme views about seemingly everything.
     
    Last edited: Mar 10, 2021
  9. Canas
    Slann

    Canas Ninth Spawning

    Messages:
    7,040
    Likes Received:
    10,684
    Trophy Points:
    113
    Sadly it is. It is worthless cuz he's feeding badly curated data into an algorithm that isn't a particularly good fit for what he's trying to do in the first place. There isn't a whole lot else to say about that. You don't get half-credit for a bad model.

    This simply means that you have a database, in this database you have a row for each unit and its various attributes.
    These attributes are the lists of features. So for each unit we have the feature "Hero": a hero would have the corresponding value be true, a normal unit would have the value false for that attribute.

    This entire database is then fed into the algorithm to build the actual trees. Both the heroes and the regular units at the same time. This is completely fine by the way, provided there's no sample bias screwing things up in your data. Which shouldn't be the case here.
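
    A minimal sketch of what such a table could look like before it goes into the algorithm. The unit names, stat values and point costs are invented for illustration, and the column names are an assumption on my part, not Listbot's actual feature list.

```python
# Hypothetical feature table: one row per warscroll, one column per feature.
# Every name and number here is made up for illustration.
units = [
    {"name": "example_hero", "hero": True, "wounds": 7, "save": 3,
     "bravery": 8, "num_abilities": 3, "points": 130},
    {"name": "example_infantry", "hero": False, "wounds": 5, "save": 4,
     "bravery": 8, "num_abilities": 2, "points": 100},
    {"name": "example_monster", "hero": False, "wounds": 10, "save": 4,
     "bravery": 10, "num_abilities": 4, "points": 160},
]

# Columns used as inputs; "points" is the label the model tries to predict.
FEATURES = ["hero", "wounds", "save", "bravery", "num_abilities"]


def to_matrix(rows):
    """Turn the list of rows into a numeric feature matrix X and label list y."""
    X = [[float(row[f]) for f in FEATURES] for row in rows]  # True/False become 1.0/0.0
    y = [float(row["points"]) for row in rows]
    return X, y


X, y = to_matrix(units)
```

    Heroes and regular units sit in the same table; the "hero" column is just one more attribute the trees can split on.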

    It should be noted that no units are directly compared though, the algorithm simply tries to optimize the amount of correct predictions. Essentially it builds a model, predicts the point cost for a unit based on whatever the model happens to be currently (or several units at once, it's unclear if he's using a batch gradient or not but that's a minor detail), sees how wrong it is, then corrects its model and sees if the new model is better. And continues to do that until either it's at max iterations, or until it reaches a (local) optimum.
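
    The predict, measure, correct loop can be sketched with one-split trees (stumps) and squared error. This is a toy version under stated assumptions, not the author's code: the data, the learning rate and the number of rounds are all invented, and real libraries add regularisation, deeper trees and early stopping on top of this.

```python
def fit_stump(X, residuals):
    """Find the single feature/threshold split that best fits the residuals.

    Assumes at least one feature takes more than one value.
    """
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two observed values
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            err = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, left_mean, right_mean)
    return best[1:]  # (feature index, threshold, left value, right value)


def predict_stump(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def boost(X, y, rounds=50, lr=0.3):
    """Start from the mean cost, then repeatedly fit a stump to the current errors."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]  # how wrong are we now?
        stump = fit_stump(X, residuals)                    # small correction
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, row) for p, row in zip(preds, X)]
    return base, stumps, lr


def predict(model, row):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, row) for s in stumps)


def mse(model, X, y):
    return sum((predict(model, row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)


# Hypothetical toy data: columns are (is_hero, wounds), labels are point costs.
X_toy = [[1, 7], [0, 5], [0, 10], [1, 5]]
y_toy = [130, 100, 160, 110]

model = boost(X_toy, y_toy)
```

    Note that nothing in the loop compares one unit to another; each round only looks at how far the current predictions are from the known point costs.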

    This is just completely wrong and shows a lack of understanding of how this algorithm works. Cuz this is distinctly not what it does.

    Also, the asterisk on shooting attack refers to the following line on his site:
    This is again very wrong. Trees are so called white boxes and one of the easiest models to read for humans. We can tell exactly why the model thinks skyfire have a point cost of 180 if we had the trees.
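
    To illustrate what "white box" means here, a small sketch that walks a single tree and records every decision on the way to the predicted cost. The tree is hand-written for illustration and is not one of Listbot's trees; only the 170 and 180 point values come from the quoted example, and the feature name is an assumption.

```python
# Hypothetical one-split tree, written by hand for illustration.
tree = {
    "feature": "has_shooting_attack",
    "threshold": 0.5,
    "left": {"value": 170.0},   # no shooting attack
    "right": {"value": 180.0},  # has a shooting attack
}


def explain(node, unit, path=None):
    """Return the predicted value and the list of decisions that led to it."""
    path = [] if path is None else path
    if "value" in node:  # leaf: nothing left to decide
        return node["value"], path
    feature, threshold = node["feature"], node["threshold"]
    if unit[feature] <= threshold:
        path.append(f"{feature} = {unit[feature]} <= {threshold}")
        return explain(node["left"], unit, path)
    path.append(f"{feature} = {unit[feature]} > {threshold}")
    return explain(node["right"], unit, path)


value, path = explain(tree, {"has_shooting_attack": 1})
```

    With a boosted model you would repeat this for every tree and add up the contributions, which is tedious but entirely readable.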

    Here's a relatively simple tutorial on boosted trees.

    This is indeed the basic issue. A +1 to hit aura for all saurus has a very different value from a +1 to hit aura for all skinks or a +1 specifically for say saurus knights. Which makes it very difficult to compare abilities, even if technically they have the exact same effect. You can't just put an attribute "+1 to hit aura" into your database which is what you'd ideally want to do cuz it's nice and simple. And putting in an attribute for every possible variant of the buff wouldn't be super useful.

    On top of that there's another major issue; most abilities are relatively rare; it's not like we have 100, or even 20, units with the same ability in AoS. The sample size for units having an aura that gives +1 to hit is going to be tiny. So even for the ones that have more or less the same value regardless of what faction has them (like terror) it'd still be liable to be an attribute with no discriminative value or one that leads to overfitting when we try to use it.

    There's also the issue that trees aren't exactly great for what we're trying to learn here as a tree doesn't strictly need to use all available attributes & values in the dataset and we already know that every single attribute is important to determine the final point value. You should probably be using something like a logistic regression, and even then you might get unlucky and end up with a weight of 0 (or even a negative weight) for certain attributes. Or well, I guess you can make it work if you force the boosting tree to have a particular form where the output of each tree is just the value of that attribute and make sure there's a tree for every attribute only containing 1 decision node, but at that point you're basically just doing a linear regression anyway.
     
  10. Carnikang
    Carnasaur

    Carnikang Well-Known Member

    Messages:
    1,301
    Likes Received:
    3,655
    Trophy Points:
    113
    Though I'm sure there is merit in the system in some fashion, how does this contribute to perceived power and how a community might react to a certain faction/subfaction/list?

    It's a data comparison tool, and I do like the "other lists made" feature of it. That can be helpful, though part of me also sees it as something that can be used by others to create an environment of exclusion based on your "AoS List Analytics".
    Then again, that's already sort of the case, with rankings and statistics. That's just a thought though, I don't have anything really substantial to back that up.

    I ask mostly because I think discussion ABOUT the mechanics of it could be a discussion entirely unto itself elsewhere. Whereas how they may affect the game and play experience could be integrated into a discussion about the general topic (NPE and by extension Seraphon sources of NPE).
     
    Putzfrau likes this.
  11. Putzfrau
    Skar-Veteran

    Putzfrau Well-Known Member

    Messages:
    2,291
    Likes Received:
    2,914
    Trophy Points:
    113
    I brought it up as an argument against the idea that seraphon are NPE because their warscrolls problematically rely on buffs for power. Which creates an NPE scenario that Canas outlined. I was using it as a tool to say that no, seraphon warscrolls aren't actually that bad without buffs, which by extension changes the NPE situation described.

    I'd argue that the NPE comes from base warscrolls that are actually *too* strong (or cheap) given the nature of the buffs they have access to. Or alternatively, buffs that are too strong given the quality of the base warscrolls.


    Are you literally implying the person that made the tool doesn't understand how the tool he made works? What? It feels like you're making some seriously drastic assumptions off a pretty "explain it to a 5-year old" description of how the model works.

    Regardless, whatever helps you sleep at night dude. At least it's something. I'm waiting for documentation on how you arrived at your "seraphon are useless without buffs" professional judgement of the warscrolls.

    Let me know when you have it. Until then i feel pretty fair in my judgement.

    edit: I apologize for updating the content of my posts so often. I try to nip and tuck areas that I feel are unnecessarily rude or argumentative, but I do most of my posting on mobile and sometimes I accidentally send before I finish those updates. You can't delete posts on this forum, so this is the situation I'm stuck with.
     
    Last edited: Mar 10, 2021
  12. Canas
    Slann

    Canas Ninth Spawning

    Messages:
    7,040
    Likes Received:
    10,684
    Trophy Points:
    113
    Well his explanation of the algorithm is wrong. His features are poorly picked. His results show various obvious issues. The algorithm he used isn't a particularly good fit for what he's trying to do. All that doesn't exactly inspire confidence in his understanding of the tools he's trying to use. What more would you like him to get wrong before calling him out on it?

    Also, you do realize that doing what he's done isn't some super complicated secret magic right? You don't exactly need to be an expert programmer to fill a CSV file with data and run an algorithm from an existing library on that using Python.
     
    ILKAIN likes this.
  13. Carnikang
    Carnasaur

    Carnikang Well-Known Member

    Messages:
    1,301
    Likes Received:
    3,655
    Trophy Points:
    113
    That's a fair way of introducing it. Like I said, it has its merits as a system, whether or not it's executed properly at the moment.

    Thank you for bringing it up.

    I tend to agree about how warscrolls and buffs interact creating NPE. Where it turns from just good interaction within a book to outright problematic mechanics is the rub.
    Like has been shown previously in the thread for skinks. The warscroll itself is okay. But anything in blobs of 40+ is better than okay.
    And then it gets buffs on top of it.
     
    Putzfrau likes this.
  14. Putzfrau
    Skar-Veteran

    Putzfrau Well-Known Member

    Messages:
    2,291
    Likes Received:
    2,914
    Trophy Points:
    113
    It takes a lot more effort than just blindly bashing shit or throwing out opinions with little to no supporting information. If you can make the model better, as i mentioned you should do the due diligence and help him out. You seem to have a pretty clear idea of exactly what it's doing poorly, so it shouldn't be much of a stretch that you can provide some kind of help, right?

    I give the guy the benefit of the doubt that he knows what he's doing considering he made the thing. I don't think it's magic. I think he understands the mechanics of what it's doing because he made it, and gave us a layman's explanation of what the model is doing. It seems to me that you're jumping down his throat based on the limited information we have available to us.

    I also appreciate that we can look at and criticize the internal mechanics because he's literally showing us it (or a peek at it). You can't do that with your judgement that seraphon scrolls are worthless because again, we're all still waiting on whatever documentation, data, personal experience, or whatever that you're using to base your opinions on.

    So i'll ask again. Please present them. Show me the information that is driving your conclusion. Give me something we can react to outside of your speculation or opinion.


    Exactly right on the system. It has merits if you have the proper context/assumptions going into it. I think you could say that about most "data" about this game.

    This game is so damn complicated with so many variables I think truly accurate data is borderline impossible. We do the best with what we have to supplement our own gut feelings, instincts and personal experiences, but even those still need proper context/assumptions.

    If you play Khorne daemons, seraphon are going to feel unbeatable compared to if you play lumineth or IDK. That context is important when presenting overarching judgements about a book.
    Sometimes I think talking about AoS in these abstract ways removes too much of the nuance that happens through player agency at the table. At the end of the day it's still a game you have to play. I watch good players with bad lists trash bad players with good lists all the time.

    In general, i just wish more people rated player agency as a more important factor in this game than the numbers on any given warscroll.

    But that's kind of a random tangent anyways :)
     
    Last edited: Mar 11, 2021
    Carnikang likes this.
  15. Canas
    Slann

    Canas Ninth Spawning

    Messages:
    7,040
    Likes Received:
    10,684
    Trophy Points:
    113
    Also assuming he's using the current set of available point costs as training data then his final model performs extremely poorly on his own training data, which is just weird. And in that case he'd be using a model to validate that the labels of his training data are correct which would just be ridiculous.

    If he's using an older set of available point costs as training data, say the GHB 2019, it'd solve the issue of predicting the labels of the training data based on a model trained on said data. But the accuracy would remain absolutely terrible given that the vast majority of his predictions do not correspond to the units' actual point costs (he has very, very few units where he predicts "100%"). Plus, it'd be problematic the moment a new feature needs to be introduced, though since he's not representing abilities in a good way to begin with I guess that isn't going to be that much of an issue.

    And if he's not using either then what in the world is his training data?
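
    A quick way to put a number on "performs poorly on its own training data", as a sketch. The point costs below are invented, and reading the tool's percentage as predicted cost divided by actual cost is my assumption about what "100%" means.

```python
# Hypothetical (actual_points, predicted_points) pairs, invented for illustration.
pairs = [(100, 100), (130, 150), (200, 160), (90, 95)]


def efficiency(actual, predicted):
    """Predicted cost as a percentage of actual cost; 100.0 is an exact match."""
    return 100.0 * predicted / actual


def share_within(pairs, tolerance_pct):
    """Fraction of units whose prediction is within tolerance_pct points of 100%."""
    hits = sum(1 for a, p in pairs if abs(efficiency(a, p) - 100.0) <= tolerance_pct)
    return hits / len(pairs)


def mean_abs_pct_error(pairs):
    """Average size of the miss, as a percentage of the actual cost."""
    return 100.0 * sum(abs(p - a) / a for a, p in pairs) / len(pairs)


exact_share = share_within(pairs, 0.0)   # units predicted at exactly 100%
close_share = share_within(pairs, 10.0)  # units predicted within 90% to 110%
```

    If those shares are low on the very data the model was trained on, the model is underfitting, or the features don't carry enough information, or both.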

    The more I look at this the worse it gets.

    I mean, my last four? five? Posts are about the various things he's doing wrong. I've already linked you a tutorial about how gradient boosted trees actually work. What more do you want? A complete undergraduate course on machine learning taught by IBM?

    Like I've pointed out in an earlier post, several of the features that are important are quite difficult to represent in a good way that a machine learning algorithm can actually do something sensible with it. I'm not entirely sure there is a good representation possible which also allows the features to actually have meaningful discriminative power. So I'd rather do something else with my time, though I'm happy to look at any new features he suggests and see if those make more sense.

    Well that at least he deserves credit for, he is definitely trying.

    Do you actually read my posts? Like seriously? I'm constantly giving examples of situations I've encountered, or try to further explain my claim. The post you're quoting here has 4 separate examples of mistakes he's made that cause me to claim he doesn't actually understand what he's doing.
     
    Putzfrau likes this.
  16. Putzfrau
    Skar-Veteran

    Putzfrau Well-Known Member

    Messages:
    2,291
    Likes Received:
    2,914
    Trophy Points:
    113
    I'm specifically asking what you're basing your judgement of seraphon on, not your judgement of the listbot model. I have no qualms with your criticism of the model, only your final conclusion (but it's still not what I was referring to in the statement you quoted). In fact, most of what I said was in reference to your seraphon opinions, not your listbot ones.

    Sorry if I didn't make that clear.

    In terms of the questions you had, his email and twitter are both readily available. I imagine you could message him and very quickly get the answer to your more technical questions.

    Regardless, @Carnikang brought up a great point in that this conversation is starting to firmly entrench itself in offtopic territory. That's my bad.

    @Canas feel free to DM me if you'd like to continue the listbot conversation.
     
    Last edited: Mar 11, 2021
    Carnikang likes this.
  17. Canas
    Slann

    Canas Ninth Spawning

    Messages:
    7,040
    Likes Received:
    10,684
    Trophy Points:
    113
    You mean the fact that most of our base units cannot stand up to an equivalent force in a limited local situation? I mean, we just had an entire discussion where, amongst other things, @Erta Wanderer showed that unbuffed we basically can't reliably beat a unit of palladors with one of our unbuffed units without striking first, and even if we strike first we needed at least a 40 point (or 24%) advantage, and depending on the unit an even larger advantage was needed. And needing a point advantage that large just to win a straight-up brawl is imho an indication of bad design?

    I mean, I'm fairly certain I've stated that often enough with the necessary examples by now.
     
  18. Putzfrau
    Skar-Veteran

    Putzfrau Well-Known Member

    Messages:
    2,291
    Likes Received:
    2,914
    Trophy Points:
    113
    That example was heavily slanted in the favor of the palladors. We were using waves of 10 saurus receiving the charge. It doesn't get worse for us in that example.

    A unit of 20 saurus, completely unbuffed with only their club attacks does around 9 damage to palladors assuming they swing first.

    A similarly costed unit of palladors does just over 6 damage to warriors assuming it swings first and shoots with its pistols.

    Even if only 75% of the saurus can get in, it's pretty damn similar damage.

    150 points of kroxigor does exactly 1 less damage than palladors completely unbuffed, around 5.3. This is only with their clubs, jaws weren't included.

    2 razordons do a little worse damage in melee but have a better shooting attack.

    6 rippers do almost exactly the same damage as palladors with only their toad rage buff. Without it they do around 1.5 less damage on average.

    All damage listed is into a 4+ save.
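
    For anyone who wants to check numbers like these: average damage is attacks x chance to hit x chance to wound x chance the save fails x damage per wound. A small sketch, where the weapon profile in the example is invented for illustration (it is not any of the warscrolls above) and rerolls, mortal wounds and other special rules are ignored.

```python
from fractions import Fraction


def p_roll(target):
    """Chance that a single D6 rolls `target` or higher (target from 2 to 6)."""
    return Fraction(7 - target, 6)


def expected_damage(attacks, to_hit, to_wound, damage, enemy_save, rend=0):
    """Average damage of one attack profile into a given save.

    `rend` is given as a positive number and worsens the save by that much;
    a save that would need more than a 6 always fails.
    """
    save_target = enemy_save + rend
    p_save = p_roll(save_target) if save_target <= 6 else Fraction(0)
    return attacks * p_roll(to_hit) * p_roll(to_wound) * (1 - p_save) * damage


# Hypothetical profile: 20 attacks, 4+ to hit, 3+ to wound, damage 1, into a 4+ save.
example = expected_damage(attacks=20, to_hit=4, to_wound=3, damage=1, enemy_save=4)
```

    Using fractions keeps the result exact; wrap it in float() if you want a decimal.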

    So did we really show what you said we did? Or is the situation a little more nuanced than "we don't have a single unit that can outfight palladors".

    Edit: just for some more data..

    Carnosaur scar vet with Spear does 9.5

    1 salamander does just over 6 with his combined shooting and melee attack.

    5 knights do again, just over 6.

    All of these numbers are completely unbuffed. However I did assume coalesced for the knights.

    And if you just want one concrete example, let's use razordons. Arguably one of the worst scrolls in our book, completely unbuffed a unit of 2 has comparable damage in melee, better damage at range, slightly worse movement and comparable/slightly worse wound efficiency. In starborne they have better bravery and access to teleport, if slightly less tough to multi damage attacks.

    You want an option to outfight palladors? 2 Razordons do more total damage unbuffed, are 10 points cheaper, and have access to significantly better Allegiance abilities and buffs. I think if anything this conversation goes to show how much fighting first and ensuring positive engagements is super important when it comes to these types of units wailing on each other with nothing else helping beyond the pure efficiency of their warscroll.

    edit: i'm an idiot, razordons don't get more attacks in coalesced lol. editing stuff to reflect that.
     
    Last edited: Mar 25, 2021
    Erta Wanderer and Carnikang like this.
  19. Dread Saurian
    Stegadon

    Dread Saurian Well-Known Member

    Messages:
    909
    Likes Received:
    1,522
    Trophy Points:
    93
    Wanna kill palladors easier than that? Comets call and celestial deliverance do wonders. So does throwing a fucking dinosaur at it.
     
    Carnikang likes this.
  20. LordBaconBane
    Ripperdactil

    LordBaconBane Well-Known Member

    Messages:
    475
    Likes Received:
    1,242
    Trophy Points:
    93
    This reminds me of something I've been thinking about. What if combat was shared? So if my Saurus are fighting clanrats, both the opponent and I roll our combat dice and do damage at the same time. This might not work very well for multi-unit engagements, but I suppose a player could choose to fight with an opponent or save their unit to attack another. It could create some interesting situations. Maybe units that charge always do damage first, so it would be like a charge bonus in TW.

    This may go against the simplicity pillar for designing the game, but could be interesting
     
    Putzfrau likes this.
