Jump to content


Photo

[SOLVED]Time to deal with the stuttering


  • Please log in to reply
135 replies to this topic

#81 Ascension64

Ascension64
  • Modder
  • 5983 posts

Posted 16 April 2012 - 02:39 AM

To use the BGMain alloc/dealloc, overload the object operator new and operator delete and define as:
void* MyObject::operator new(size_t size) { return ::operator new(size, 0); }
void MyObject::operator delete(void* mem) { return ::operator delete(mem, 0); }
I have described this in ProgramNotes.txt.

For simplicity purposes, I've made everything public in TobEx, at great risk. Anyway, access level is compiler-based and not program-based, so when looking at the disasm, I cannot ever details what is public/protected/private.

--------------
Retired Modder
Note: I do not respond to profile comments/personal messages in regards to troubleshooting my modifications. Please post on the public forums instead.

Baldur's Gate Trilogy-WeiDU and Mods
Throne of Bhaal Extender (TobEx)

Contributions: (NWN2) A Deathstalker (voice acting) - (IWD2) IWD2 NPC Project (soundset editing) - (Misc) SHS PC Soundsets (voice acting)
Legacy: (BG/Tutu/BGT) Beregost Crash Fixer 1.9 (18 Jul 10) - (BG2) Enable conversations with charmed/dominated creatures (18 Jul 10) - (BG2) Experience Corrections (18 Jul 10) - (Misc) Platform Conversion Utility RC2 (13 Feb 10)


#82 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 16 April 2012 - 05:30 AM

Still not clear on why you can't just use Boost::Map directly or something - it's bound to be optimized to death.

Because:
1) the CVariableMap is 8 bytes. If I want to replace it with std::map/hash_map anything else, I'll have to store a pointer to such structure somewhere.
2) If I replace whatever stored in those 8 bytes with a pointer to library implementation of hashtable, I will need to provide that no method will read this memory in any other way. Considering there are a lot of methods which I do not know about it seems pretty impossible.
3) Why optimize any further thing that already works faster than eye can see?
Considering statements above it's easier to re-implement Add/Find functionality than to re-implement the whole interface around a library hashmap.

Ascension64
What do I do with the implemented stuff? I mean should I send it to you or will you re-implement the same thing yourself or do you have more important stuff to work on for now?

Edited by Suslik, 16 April 2012 - 05:31 AM.


#83 Ascension64

Ascension64
  • Modder
  • 5983 posts

Posted 17 April 2012 - 03:42 AM

What do I do with the implemented stuff? I mean should I send it to you or will you re-implement the same thing yourself or do you have more important stuff to work on for now?

Well, you need to make sure it works fully and doesn't crash the game. I imagine you can just fork the github repository and make a branch. Otherwise, you can just diff your work.

--------------
Retired Modder
Note: I do not respond to profile comments/personal messages in regards to troubleshooting my modifications. Please post on the public forums instead.

Baldur's Gate Trilogy-WeiDU and Mods
Throne of Bhaal Extender (TobEx)

Contributions: (NWN2) A Deathstalker (voice acting) - (IWD2) IWD2 NPC Project (soundset editing) - (Misc) SHS PC Soundsets (voice acting)
Legacy: (BG/Tutu/BGT) Beregost Crash Fixer 1.9 (18 Jul 10) - (BG2) Enable conversations with charmed/dominated creatures (18 Jul 10) - (BG2) Experience Corrections (18 Jul 10) - (Misc) Platform Conversion Utility RC2 (13 Feb 10)


#84 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 17 April 2012 - 04:10 AM

Well, you need to make sure it works fully and doesn't crash the game.

I've been playing for a few hours already, it's working great. Absolutely stutter-free, when PartyHasItem() fix is enabled. But I am still unsure if I have done all right according to TobEx rules, conventions and restrictions, so you'd better check this yourself anyway.

Can I just send files I have modified to you directly?

Edited by Suslik, 17 April 2012 - 08:37 AM.


#85 Beleg33

Beleg33

    AKA Adanedhel on G3

  • Member
  • 521 posts

Posted 17 April 2012 - 06:42 AM

I've been following this thread and even though I didn't understand anything I have to say thank you Suslik for looking into the stutter plague of megamods; and thanks to the others who helped him.

I hope whatever solution you came up with is gonna be implemented in TobEx (from what I understand) when it's fully tested. Great job guys!
Random spambot #8434678 said :

you should liquor multiplying great deal supplment your to office apparel predicated copy may possibly be an go through check out this behave as more busy den has an interest in pc


#86 Salk

Salk
  • Modder
  • 1419 posts

Donator

Posted 17 April 2012 - 07:04 AM

Yes, thanks to Suslik and Ascension64 for assisting and supporting him!

#87 -guest-

-guest-
  • Guest

Posted 17 April 2012 - 07:30 AM

Great work indeed, and I just wanted to say there is something, perhaps similar, where party members keep stopping. I think it's to do with them trying to start a conversation but it never triggers, it just keeps stopping their movements. Anyone else know what i'm talking about?

#88 i30817

i30817
  • Member
  • 611 posts

Posted 17 April 2012 - 07:48 AM

I don't really like the resizing algorithm.

I mean think of what happens when it needs a new slot:
it will grow to twice the size (10/10+10/10) and search for 1/10 of the new size again - 20% of the old size - which means it might need to grow 2 or 3 times until it finds a free slot, doubling memory 3 times. And if the hashcode starts to line up values, i don't want to think what might happen to performance.

Oh well, if you can't use Boot, you can't.

Edited by i30817, 17 April 2012 - 07:55 AM.


#89 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 17 April 2012 - 08:36 AM

I think it's to do with them trying to start a conversation but it never triggers, it just keeps stopping their movements.

Unfortunately, that's another kind of stuttering and my fix does not affect this particular problem. It's script-related, since NPC tries to initiate dialog, but he "has nothing to say" or similar. Though such issues can usually be debugged with ease without digging into the executable, because they are script-related.

i30817

it will grow to twice the size (10/10+10/10) and search for 1/10 of the new size again - 20% of the old size - which means it might need to grow 2 or 3 times until it finds a free slot, doubling memory 3 times.

Umm.. I cannot see your point. Assume that I have a perfectly distributed hash table which is currently of size 1000(for easy calculations). Current load of the table is 99%. 99% load means that there are 10 empty elements and maximum linear chunk of occupied buckets is 1000 / 10 = 100 buckets. It is exactly the case when I enlarge my hashtable. But when its size is doubled, maximum linear chunk will become not 100/2 = 50 as you might have expected, but 2000 / 1010 ~ 2 buckets long! It means that when the size is doubled, maximum search length is reduced 50 times instead of 2. Such dramatic drop in search length insures that the hashtable will never(provided hash funcion is perfect) be enlarged two times in a row.

It also means that the hashtable will be only be resized when it reaches about 99% load(instead of 100% in standard implementation), not 10% as you might have though.

The only difference in my hashtable is that it starts growing just a little earlier(99% instead of 100% load).

Beleg33, Salk
Glad to be of assistance, guys. I hope Asc64 will include my solution to TobEx so that it will be available for everyone. If you are eager to test it yourself, you can grab TobEx.dll + detoured.dll from the first post of this thread.

Edited by Suslik, 17 April 2012 - 09:04 AM.


#90 i30817

i30817
  • Member
  • 611 posts

Posted 17 April 2012 - 10:02 AM

I see what you mean, i'd forgotten about what resize would really do (rehash the whole table positions) instead of just copy into doubled array.

What was the reason for the persistent slowdowns with the 100% resize? I get it's the linear search of a degenerated hashtable - was it searching the whole table?

It might be "testing" a value that is not on the table, that then is NOT put on it.
So it's a script "peculiarity", exposing a weakness on the find method?

BTW, your "double return", non-initialized nArraySize, and unused sizeBefore look a little... buggy.

Edited by i30817, 17 April 2012 - 10:08 AM.


#91 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 17 April 2012 - 10:14 AM

I see what you mean, i'd forgotten about what resize would really do (rehash the whole table positions) instead of just copy into doubled array.

Of course. Rehashing is imperative, otherwise the comlexity of search would not even change. Rehashing happens about 2-4 times during the whole gameplay session, and when it happens the game lags for about 0.1 seconds, almost unnoticeable.

What was the reason for the persistent slowdowns with the 100% resize? I get it's the linear search of a degenerated hashtable - was it searching the whole table?

Yep. Slowdown was mostly in case when table is about 99.5 percent full, the search degenerated to linear.

It might be "testing" a value that is not on the table, that then is NOT put on it.

Nah, not the reason. It may as well search for an existing key that's simply far away from the place when its hash points. So the default algorithm had to walk O(n) steps until it bumped into it. And partially you are right - in case the value does not exist it had to walk till the first empty space, which can degenerate to O(n) as well.

So it's a script "peculiarity", exposing a weakness on the find method?

Script AND gameplay state when the table is near-full.

Edited by Suslik, 17 April 2012 - 10:17 AM.


#92 i30817

i30817
  • Member
  • 611 posts

Posted 17 April 2012 - 10:21 AM

From what i saw in the java sources, they try to keep the load factor at 75% for a good tradeoff of wasted space - speed.
How did you calculate 99%? I don't see any reason (if you had extreme bad luck) why the table couldn't fill up completely.

Lol; learned about a new hashing method (2008!)
http://en.wikipedia....pscotch_hashing

Edited by i30817, 17 April 2012 - 10:30 AM.


#93 i30817

i30817
  • Member
  • 611 posts

Posted 17 April 2012 - 10:34 AM

Also couldn't you optimize the find method a little more?

for(unsigned int i = 0; i < nArraySize; i++)

there could maybe be limited to
for(unsigned int i = 0; i < nArraySize / 10; i++)

like the add method?

Unless the unmarshall and marshall savegame serialization can mess that up?
Also, i'd put the
if(IsEmpty(pArray[index])) break;

before the string equality test (aren't you comparing random memory potentially?)

Edited by i30817, 17 April 2012 - 10:37 AM.


#94 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 17 April 2012 - 11:03 AM

From what i saw in the java sources, they try to keep the load factor at 75% for a good tradeoff of wasted space - speed.
How did you calculate 99%? I don't see any reason (if you had extreme bad luck) why the table couldn't fill up completely.

Yep. 75-80% is about optimal load for such table. But since I do not know exact table load and I cannot directly store it in any way(remember those 8 bytes I have?) the only way to estimate the load is to check the length of linear search assuming that distribution is more or less uniform. So that estimation of length/10 is heuristic and was made in favour of better better size than search complexity, since even with this constant search works quickly enough. And that 99% is an upper boundary of load made with an assumption of perfectly uniform distribution. Considering actually non-uniform distribution the hashtable will resize at actual 80-95% load.

Also couldn't you optimize the find method a little more?

for(unsigned int i = 0; i < nArraySize; i++)

there could maybe be limited to
for(unsigned int i = 0; i < nArraySize / 10; i++)

like the add method?

I've thought about this one too, but I'm not sure. Though I cannot come up with a counter-test to prove this heuristic wrong, I wouldn't risk adding it since performance is quite good already.

Also, i'd put the
if(IsEmpty(pArray[index])) break;

before the string equality test (aren't you comparing random memory potentially?)

Nope. CVariable is initialized with an empty null-terminated string. So it's absolutely correct to compare it to other string. And if you'd taken a look at my IsEmpty() funtion, you'd not ask this ; )
bool IsEmpty(CVariable &var)
{
        return var.name.GetBuf()[0] == 0;
}

Edited by Suslik, 17 April 2012 - 11:05 AM.


#95 Sasha Al'Therin

Sasha Al'Therin
  • Modder
  • 615 posts

Posted 17 April 2012 - 11:14 AM

I stopped following this topic cause you didn't want to listen to me and my suggestion that the problem lay outside of baldur.bcs. I only recently read up on it again. Its interesting to see that you did indeed move away from baldur.bcs to find the real problem.

Congratulations on finding the source. Tho I think the only games that would really need to use a patch to improve performance/storage of variables are overly packed BWP type of games. Smaller sane installs may never encounter the problem.

My working mods:
an AI Party Script for BG2 game engine DOWNLOAD LINK ONLY!
Interactive Tweaks for BG series with some IWD support. DOWNLOAD LINK ONLY!
Rest For 8 Hours an IWD mod
-------------------------------------------
My contributions: BG1Fixpack, BG1Tweaks
On Hold: Solestia an NPC for SOA
-------------------------------------------
My website: http://sasha-altheri...s.com/index.htm


#96 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 17 April 2012 - 11:21 AM

I stopped following this topic cause you didn't want to listen to me and my suggestion that the problem lay outside of baldur.bcs. I only recently read up on it again. Its interesting to see that you did indeed move away from baldur.bcs to find the real problem.

Why? It was exactly my test baldur.bcs stuttering, because I have filled it with tons of GetGlobal's that started to be executed slowly due to unoptimized hash table inside the infinity engine. My assumption was right almost from the beginning.

Tho I think the only games that would really need to use a patch to improve performance/storage of variables are overly packed BWP type of games. Smaller sane installs may never encounter the problem.

In vanilla game somewhere around the Underdark I have about 700 global variables. The first wave of stuttering starts when the number of global variables will get close to 1024. I think you can figure out the rest yourself.

But thanks anyway.

Edited by Suslik, 17 April 2012 - 11:27 AM.


#97 i30817

i30817
  • Member
  • 611 posts

Posted 17 April 2012 - 11:34 AM

Even if it's null terminated, comparing for null before doing the string equality test will save on a constant string equality test on early return (and doesn't make a difference for the normal case).

Unless you want to be able to "add" empty string mappings - is this even possible now on the original game?

Edited by i30817, 17 April 2012 - 11:34 AM.


#98 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 17 April 2012 - 11:42 AM

Even if it's null terminated, comparing for null before doing the string equality test will save on a constant string equality test on early return (and doesn't make a difference for the normal case).

I'm sort of puzzled. The algorithm originally was implemented really poorly which caused its performance to degrade hundredfold. I have implemented just a normal version of it which has only one purpose - to negate this degradation. Why do you keep trying to find flaws in it? It's not supposed to be super-optimized performance monster, it just should not lag, lol.

If you wish, I would glad to challenge you in any ACM contest or implementation of a standard but more complicated algorithm such as red-black tree. I really doubt that the regular hashmap I've implemented really deserves to be discussed in such thorough way.

Edited by Suslik, 17 April 2012 - 11:49 AM.


#99 i30817

i30817
  • Member
  • 611 posts

Posted 17 April 2012 - 11:52 AM

If you can't take the criticism, don't display the code.

It's also just a piece of code, not your master work ffs. Don't be so sensitive.

Edited by i30817, 17 April 2012 - 11:54 AM.


#100 Suslik

Suslik

    Investigator

  • Member
  • 500 posts

Posted 17 April 2012 - 11:56 AM

If you can't take the criticism, don't display the code.

I'm not like.. forcing you to use it, you know?.. I just wanted to help. And you are welcome to re-implement it in a proper way.

Edited by Suslik, 17 April 2012 - 11:58 AM.